Vector Functions in R Programming
Introduction to Vector Functions in R Programming
Vectors! They are among the most basic and, at the same time, most important data structures in R programming. A vector is a one-dimensional data structure that stores elements of the same data type (numeric, integer, character, logical, or complex) in a sequential manner.
We looked briefly at vectors in one of my previous articles, Introduction to R Programming: Part 2. If you haven’t read it yet, it is worth going through the data structures covered there first.
In this article though, we are about to discuss the functions which are specifically designed for vectors in R programming. We will list out all those functions which can be used over a vector and then check them one by one, in-depth with a handful of examples.
Functions are nothing but pieces of code that let you automate a certain task and reduce repetition in your code, which in turn makes the code more efficient and more readable.
What are Vector Functions?
The vector functions are the ones developed specifically to be applied to vectors. These functions allow us to either create a vector or manipulate the vectors in a way we want. The following is the list of common functions we use on vectors under R.
- rep() function
- seq() function
- is.vector() function
- as.vector() function
- any() function
- all() function
- lapply() function
- sapply() function
Let us discuss these functions one by one in detail with some hands-on examples.
The rep() Function in R Programming
If we want to repeat a vector a certain number of times, we use the rep() function. When you apply rep() to a vector, it replicates the values of that vector the given number of times. This is a generic function, meaning it can work on a vector of any data type (character, logical, numeric, and so on). Not every function is this flexible. For example, mean() is only meaningful for numeric (or logical) vectors.
Syntax for rep() function is as shown below:
rep(x, ...)
Where,
x – is a vector of any data type (or another data structure, such as a list) that we need to repeat.
... – specifies further arguments, which may include the following:
- times – a non-negative integer giving the number of times the vector should be repeated.
- length.out – the desired length of the output vector. If you have a vector with three elements and set length.out to 6, the vector is repeated twice to fill six positions.
- each – how many times each element of x should be repeated.
These arguments are all optional and are passed through “...”, which is why the syntax above shows only x and “...”.
Let us see an example of the rep() function. See the code below:
> # rep() function to repeat a vector in R
> rep_vect <- rep(c(1, 5, 9), 2)
> print(rep_vect)
[1] 1 5 9 1 5 9
We can also specify times, length.out, or each as a named argument to this function to control how the input vector is repeated.
> rep_vect1 <- rep(c(1, 5, 9), times = 2) #repeating vector two times
> print(rep_vect1)
[1] 1 5 9 1 5 9
> rep_vect2 <- rep(c(1, 5, 9), each = 2) #repeating each element of vector 2 times
> print(rep_vect2)
[1] 1 1 5 5 9 9
> rep_vect3 <- rep(c(1, 5, 9), length.out = 6) #repeating elements of vector until given length
> print(rep_vect3)
[1] 1 5 9 1 5 9
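The three arguments described above can also be combined, and times may itself be a vector. The following is a minimal sketch of that behaviour; the object names (letters_vec, per_element, and so on) are made up for illustration and do not come from the article.

```r
# Illustrative object names; they are assumptions, not part of the article.
letters_vec <- c("a", "b", "c")

# 'times' given as a vector: one repeat count per element of x.
per_element <- rep(letters_vec, times = c(3, 1, 2))
print(per_element)      # "a" "a" "a" "b" "c" "c"

# 'each' is applied first, then the whole result is repeated 'times' times.
each_then_times <- rep(c(1, 5, 9), each = 2, times = 2)
print(each_then_times)  # 1 1 5 5 9 9 1 1 5 5 9 9

# 'length.out' cuts the repeated vector off at exactly that many elements.
truncated <- rep(c(1, 5, 9), length.out = 5)
print(truncated)        # 1 5 9 1 5
```

Note that length.out does not have to be a multiple of the input length; the last repetition is simply cut short.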
The seq() Function in R Programming
The seq() function is a generic function that generates a regular sequence starting at “from” and running up to “to”, moving in steps of the size given in “by”. Its default method works with numeric values, and the sequence it generates is always finite.
The syntax for the seq() function in R is shown below.
seq(from=, to=, by=)
where,
from – specifies the starting point of the sequence. It defaults to 1 when omitted.
to – specifies the end point of the sequence, that is, the value up to which the sequence is generated.
by – the step, or increment, between consecutive values of the sequence.
Let us see an example of the seq() function for a better understanding.
#Generating a sequence from 1 to 5 with step 0.5
> seq_1 <- seq(from= 1, to= 5, by= 0.5)
> print(seq_1)
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
The step argument “by” can also be replaced by the “length.out” argument, which specifies the desired length of the sequence. Based on the value provided for “length.out”, R works out the increment needed to spread that many equally spaced values between “from” and “to”. See the example code below for your reference.
#Generating a sequence using length.out argument
> seq_2 <- seq(from= 10, to= 12, length.out = 20) #generate 20 elements between 10 and 12
> print(seq_2)
[1] 10.00000 10.10526 10.21053 10.31579 10.42105 10.52632 10.63158
[8] 10.73684 10.84211 10.94737 11.05263 11.15789 11.26316 11.36842
[15] 11.47368 11.57895 11.68421 11.78947 11.89474 12.00000
Here, we have used the “length.out” argument to generate a sequence of 20 numbers starting at 10 and ending at 12.
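To make the relationship between “by” and “length.out” concrete, here is a small sketch. It assumes the increment for an equally spaced sequence is (to - from) / (length.out - 1), and checks that assumption against the gaps in the generated sequence. The names step, same_steps, and countdown are illustrative only.

```r
# Recreate the 20-element sequence between 10 and 12.
seq_2 <- seq(from = 10, to = 12, length.out = 20)

# Assumed increment for an equally spaced sequence: (to - from) / (length.out - 1).
step <- (12 - 10) / (20 - 1)
print(step)   # 0.1052632

# diff() returns the gaps between neighbouring elements; all 19 gaps
# should match the computed step (up to floating-point tolerance).
same_steps <- isTRUE(all.equal(diff(seq_2), rep(step, 19)))
print(same_steps)   # TRUE

# A negative 'by' produces a decreasing sequence.
countdown <- seq(from = 10, to = 1, by = -3)
print(countdown)    # 10 7 4 1
```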
The is.vector() Function in R Programming
Sometimes we need to check whether a given object is a vector, and R has a function for that.
We can use the is.vector() function, which takes an object as an argument and returns “TRUE” if the object is a vector and “FALSE” if it is not. See the example below for a better understanding.
> #Checking if the object provided is a vector or not
> is.vector(seq_1)
[1] TRUE
> is.vector("a") #Even an object with single text is also a vector.
[1] TRUE
In the code above, seq_1 is the vector we created in the previous example, and checking it with is.vector() returns TRUE, meaning seq_1 is a vector. In the second example, we passed “a” as the input and the function returns TRUE as well. This is because R has no separate scalar type: even a single value is stored as a vector of length one.
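It can help to see which objects is.vector() accepts and which it rejects. The sketch below uses made-up objects; the rule of thumb it illustrates is that is.vector() returns FALSE once an object carries attributes other than names, such as the dim attribute of a matrix.

```r
# Illustrative objects; the names are assumptions, not part of the article.
single_value <- 42                      # a length-one numeric vector
named_vec    <- c(a = 1, b = 2)         # names are the one attribute allowed
a_list       <- list(1, "a", TRUE)      # a list counts as a (generic) vector
a_matrix     <- matrix(1:4, nrow = 2)   # carries a 'dim' attribute
a_factor     <- factor(c("x", "y"))     # carries 'levels' and 'class' attributes

checks <- c(
  single_value = is.vector(single_value),  # TRUE
  named_vec    = is.vector(named_vec),     # TRUE
  a_list       = is.vector(a_list),        # TRUE
  a_matrix     = is.vector(a_matrix),      # FALSE
  a_factor     = is.vector(a_factor)       # FALSE
)
print(checks)

# The optional 'mode' argument narrows the check to one kind of vector.
list_is_numeric <- is.vector(a_list, mode = "numeric")   # FALSE
print(list_is_numeric)
```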
The as.vector() Function in R Programming
Sometimes we want to convert an object into a vector (rare, but it does occur). Thankfully, R has a function named as.vector() that takes an object as an argument and converts it into a vector.
Consider an example where we create a matrix with two rows and two columns. See the code below:
#Creating a matrix with two rows and two columns
> mymat <- matrix(c(2, 3, 4, 5), nrow = 2, ncol = 2, byrow = TRUE)
> print(mymat)
     [,1] [,2]
[1,]    2    3
[2,]    4    5
> print(class(mymat)) # Printing class of mymat
[1] "matrix" "array" # mymat is a matrix (versions of R before 4.0.0 print only "matrix")
Now, we are going to use the as.vector() function on this matrix to convert it into a vector. See the code below for a better understanding.
#converting matrix to a vector
> mat_to_vect <- as.vector(mymat)
> print(mat_to_vect)
[1] 2 4 3 5 #mat_to_vect is a vector now
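Here we can clearly see that the matrix named “mymat” has been converted into a plain vector instead of a 2x2 matrix. Note that R reads the matrix column by column, which is why the output is 2 4 3 5 rather than 2 3 4 5.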
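If the row-wise order is what you need, one option is to transpose the matrix before flattening it. The sketch below shows this, along with the optional mode argument of as.vector(), which converts the element type at the same time. The names by_column, by_row, and as_text are illustrative.

```r
# Same matrix as in the example above, filled row by row.
mymat <- matrix(c(2, 3, 4, 5), nrow = 2, ncol = 2, byrow = TRUE)

# Default behaviour: elements are read column by column.
by_column <- as.vector(mymat)
print(by_column)   # 2 4 3 5

# Transposing first gives the elements in row order.
by_row <- as.vector(t(mymat))
print(by_row)      # 2 3 4 5

# 'mode' converts the type of the elements as well.
as_text <- as.vector(c(1, 2, 3), mode = "character")
print(as_text)     # "1" "2" "3"
```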
The any() Function in R Programming
When we need to check whether any element of a given vector satisfies a certain logical condition, we use the any() function in R. If at least one element satisfies the condition, it returns TRUE; otherwise it returns FALSE.
In short, this function checks whether at least one element of the input vector meets the condition. See the examples of the any() function shown below.
#Example 1
> x <- seq(from= 1, to= 5, by= 0.5)
> any(x < 0)
[1] FALSE
#Example 2
> p <- seq(from= 2, to= 6, by= 1)
> q <- seq(from= 1, to= 5, by= 1)
> any(p + q > 4)
[1] TRUE
Here, in the first example, we have created a vector of numbers from 1 to 5 (in steps of 0.5) and we check whether any value in x is less than zero. Since all values in x are positive, we get FALSE.
For the second example, we have defined two vectors p and q holding the numbers 2 to 6 and 1 to 5 respectively. Now we want to check whether any value of p + q is greater than four. We get TRUE because the condition holds for at least one element: the element-wise sums are 3, 5, 7, 9, and 11, and four of them are greater than four.
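Two practical details are worth a quick sketch: which() tells you which positions satisfy the condition, and missing values (NA) can make the answer of any() itself NA unless na.rm = TRUE is set. The vector with_na and the other names are made up for illustration.

```r
p <- seq(from = 2, to = 6, by = 1)
q <- seq(from = 1, to = 5, by = 1)

# Element-wise sums and the positions where the condition holds.
sums <- p + q
print(sums)                  # 3 5 7 9 11
matches <- which(sums > 4)
print(matches)               # 2 3 4 5

# Illustrative vector containing a missing value.
with_na <- c(NA, -1, 3)

# No known element is > 5, but the NA might be, so the answer is NA.
unknown <- any(with_na > 5)
print(unknown)               # NA

# na.rm = TRUE drops the missing value before checking.
known <- any(with_na > 5, na.rm = TRUE)
print(known)                 # FALSE

# One TRUE is enough, even when an NA is present.
positive <- any(with_na > 0)
print(positive)              # TRUE
```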
The all() Function in R Programming
When we need to check whether any element of our vector satisfies a logical condition, we have the any() function as discussed above. However, when we need to check whether every element of the vector satisfies the condition, we use the all() function.
- When you want to check if any of the elements of the given vector satisfies the logical condition, use “any()”.
- When you want to check if all of the elements of the given vector satisfy the logical condition, use “all()”.
#Example 1
> x <- seq(from= 1, to= 5, by= 0.5)
> all(x > 0) #All elements from vector x are positive numbers? YES
[1] TRUE
#Example 2
> p <- seq(from= 2, to= 6, by= 1)
> q <- seq(from= 1, to= 5, by= 1)
> all(p + q > 4) #All elements from p and q have sum greater than 4? NO
[1] FALSE
Let us consider the same vectors that we used for the any() function. In Example 1, we check whether all the values in vector x are positive. Since the vector contains the values 1 to 5, which are all positive, the condition holds and we get TRUE.
On the other hand, when I check whether every element-wise sum of p and q is greater than 4, I get FALSE. The reason? Not all of the sums are greater than 4: the first one is 2 + 1 = 3.
Note: When I say p + q, it means the elements in the first position of p and q are added together, the elements in the second position of p and q are added together, and so on. It is a one-to-one mapping, keep that in mind.
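The one-to-one mapping described in the note can be made visible by printing the intermediate results. This is a small sketch using the same p and q; the names sums, comparison, and failing are illustrative.

```r
p <- seq(from = 2, to = 6, by = 1)
q <- seq(from = 1, to = 5, by = 1)

# Position-by-position sums: 2+1, 3+2, 4+3, 5+4, 6+5.
sums <- p + q
print(sums)          # 3 5 7 9 11

# The comparison produces one logical value per element.
comparison <- sums > 4
print(comparison)    # FALSE TRUE TRUE TRUE TRUE

# all() is TRUE only if every value is TRUE.
every_sum_above_4 <- all(comparison)
print(every_sum_above_4)   # FALSE

# which() on the negated comparison shows the position that fails.
failing <- which(!comparison)
print(failing)       # 1

# With a lower threshold, every element passes.
every_sum_above_2 <- all(sums > 2)
print(every_sum_above_2)   # TRUE
```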
The lapply() Function in R Programming
Have you ever come across a situation where you need to apply a certain function to every element of a vector? It is a pretty common situation, right? R has you covered: the lapply() function helps here. You pass it an input vector and the function you want to apply to each element. The result, however, is always a list. Even if you provide a vector as input, lapply() returns a list after applying the function to each element.
#Defining a vector with names of super heroes
> SuperHeroes <- c("IRON MAN", "CAPTAIN AMERICA", "THOR", "HULK")
#Trying to convert the names in lower case
> SuperHeroes <- lapply(SuperHeroes, tolower)
> print(SuperHeroes)
[[1]]
[1] "iron man"
[[2]]
[1] "captain america"
[[3]]
[1] "thor"
[[4]]
[1] "hulk"
In the example above, we have created a vector named “SuperHeroes” that contains the names of superheroes, all in upper case. The tolower() function converts each name to lower case, and as a result you can see that all the names are now in lower case. However, the result is a list instead of a vector, because the lapply() function always generates a list as its output.
The “l” in the function name stands for list, and the function was initially developed for working with lists. Having said that, it also works fine on vectors and data frames.
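If a plain vector is needed after lapply(), one common option is to flatten the list with unlist(). The sketch below shows that, and also that the function passed to lapply() can be an anonymous function. The name heroes and the other object names are illustrative.

```r
heroes <- c("IRON MAN", "CAPTAIN AMERICA", "THOR", "HULK")

# lapply() returns a list with one element per input element.
lower_list <- lapply(heroes, tolower)
print(class(lower_list))   # "list"

# unlist() flattens that list back into a character vector.
lower_vec <- unlist(lower_list)
print(lower_vec)           # "iron man" "captain america" "thor" "hulk"

# Any function works, including an anonymous one defined inline.
name_lengths <- lapply(heroes, function(h) nchar(h))
print(unlist(name_lengths))   # 8 15 4 4
```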
The sapply() Function in R Programming
When we use lapply(), the output is always a list, and having to deal with a list for whatever object you provide can be inconvenient. We need something that chooses the output structure more smartly after applying the function to all elements of the object. For that, R has a simplified wrapper called “sapply()”. The two functions apply the given function in the same way; they differ only in how the result is returned.
The “sapply()” function checks at execution time whether the result can be simplified. If you provide a list as input and every result has length one, it returns a vector. If every result has the same length greater than one, it returns a matrix. Otherwise, it falls back to returning a list, just as lapply() does.
> SuperHeroes <- c("IRON MAN", "CAPTAIN AMERICA", "THOR", "HULK")
> SuperHeroes <- sapply(SuperHeroes, tolower)
> print(SuperHeroes)
         IRON MAN   CAPTAIN AMERICA              THOR              HULK 
       "iron man" "captain america"            "thor"            "hulk" 
In the example code given above, we have applied the sapply() function to the vector named “SuperHeroes”. In the result, you can see that instead of returning a list, sapply() returns a named character vector: the original values are kept as the names (top row) and the lower-case values are the elements (bottom row).
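The way sapply() picks its output structure can be seen by varying the length of what the applied function returns. The following is a minimal sketch with made-up object names; it also uses the USE.NAMES argument of sapply() to drop the names when they are not wanted.

```r
heroes <- c("IRON MAN", "CAPTAIN AMERICA", "THOR", "HULK")

# Character input: the original values become the names of the result.
named_result <- sapply(heroes, tolower)
print(names(named_result))   # "IRON MAN" "CAPTAIN AMERICA" "THOR" "HULK"

# USE.NAMES = FALSE returns the same values without names.
plain_result <- sapply(heroes, tolower, USE.NAMES = FALSE)
print(plain_result)          # "iron man" "captain america" "thor" "hulk"

# Every result has length 2, so the output is simplified to a 2 x 3 matrix.
as_matrix <- sapply(1:3, function(i) c(i, i^2))
print(dim(as_matrix))        # 2 3

# Results of different lengths cannot be simplified, so a list comes back.
as_list <- sapply(1:3, function(i) seq_len(i))
print(class(as_list))        # "list"
```

Because the output type depends on the data, code that must always receive the same structure may be easier to reason about with lapply() followed by an explicit conversion.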
Summary
- Vectors are important building blocks of R programming, and so are the vector functions.
- There are specifically designed functions that either return a vector as an output or take a vector as an argument.
- The rep() and seq() functions return a vector as an output.
- Functions such as is.vector(), as.vector(), any(), all(), lapply(), and sapply() take a vector as an argument and perform certain tasks as per their definitions.
We will stop this article here and get back to you with new interesting facts about R programming in our next article. Until then, stay safe! Keep enhancing! 🙂