
Cheat Sheet – Week 4, Input Data into R

Function Example Options (parameters) Description


edit()
  Example:     df <- edit(df)
  Options:     fix(df) edits df in place (equivalent to df <- edit(df)) and can be
               used to add more elements to the data frame.
  Description: Invokes a text editor for entering data manually. An empty data frame
               must be created first (e.g., df <- data.frame()), and then edit() is
               called on it. The result of edit() must be assigned to a data frame;
               otherwise the changes are discarded once the editor is closed.

read.table()
  Examples:    df <- read.table("file.csv")
               df <- read.table("URL")
               df <- read.table(text = file)
  Options:     header = TRUE treats the first row as the column names.
               sep = "" (the default) separates fields on white space.
               text = file reads the data from the character string file instead
               of from a file on disk.
               URL is a web address.
  Description: Reads a delimited text file in table format and saves it as a
               data frame.
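
As a minimal sketch of the text = option, the snippet below reads a small white-space-delimited table from a character string. The sample data and the variable name raw are made up for illustration.

```r
# Hypothetical sample data, supplied as a string instead of a file on disk.
raw <- "name age city
Ana 31 Canberra
Ben 27 Sydney"

# header = TRUE uses the first row as column names;
# sep = "" (the default) splits fields on any run of white space.
df <- read.table(text = raw, header = TRUE, sep = "", stringsAsFactors = FALSE)

str(df)
```

Passing a file name or a URL as the first argument instead of text = works the same way.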

read.delim()
  Examples:    df <- read.delim("file.txt")
               df <- read.delim("URL")
  Options:     header = TRUE is the default.
               sep = "\t" (tab) is the default separator.
               row.names = "variable_name" uses that column as the index (row
               name) of each row.
               URL is a web address.
  Description: Another method to import data from a delimited text file. It is a
               special case of read.table().
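
The row.names option can be illustrated with a small tab-delimited file. This is only a sketch: the file is written to a temporary path first, and the column names (id, species, weight) are invented for the example.

```r
# Write a hypothetical tab-delimited file to a temporary location.
path <- tempfile(fileext = ".txt")
writeLines(c("id\tspecies\tweight",
             "s1\tcat\t4.2",
             "s2\tdog\t11.5"), path)

# header = TRUE and sep = "\t" are the defaults for read.delim();
# row.names = "id" turns the id column into the row names.
df <- read.delim(path, row.names = "id")

rownames(df)
df["s2", "weight"]
```

After the call, id is no longer a regular column; it is only available through rownames(df).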

read.csv()
  Examples:    df <- read.csv("file.csv")
               df <- read.csv("URL")
  Options:     header = TRUE is the default.
               sep = "," (comma) is the default separator.
               row.names = "variable_name" uses that column as the index (row
               name) of each row.
               URL is a web address.
  Description: Reads comma-separated values (CSV) data from both .txt and .csv
               files.

source()
  Example:     source("file.r")
  Description: Processes (runs) the commands in a script file.
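
A short sketch of source() in action, assuming a throwaway script written to a temporary file; the script contents and the variable names x and total are illustrative only.

```r
# Create a hypothetical two-line script in a temporary file.
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1:5",
             "total <- sum(x)"), script)

# Run the script; local = TRUE evaluates it in the calling environment,
# so the objects it creates are visible right after the call.
source(script, local = TRUE)

total
```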

readLines()
  Example:     lines <- readLines("file.txt")
  Options:     n = p reads at most p lines.
  Description: Reads unstructured text, line by line, and returns a character
               vector with one element per line (not a data frame).
               Characters are not coerced to factors.
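
The effect of n can be seen with a small temporary file, as in this sketch; the file contents are made up for the example.

```r
# Write three hypothetical lines to a temporary text file.
path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line", "third line"), path)

# n = 2 stops after two lines; omitting n reads the whole file.
first_two <- readLines(path, n = 2)
all_lines <- readLines(path)

first_two
length(all_lines)
```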


read_excel()
  Example:     df <- read_excel("file.xlsx")
  Options:     sheet = "Sheetn", where n is the number of the sheet to be read.
               skip = n, where n is the number of lines to be skipped.
               range = "An:Dm" specifies the range of cells to be read,
               e.g., "A1:D5".
  Description: This function comes in the readxl package, which works with both
               .xls and .xlsx formats.
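
The sheet and range options can be tried without preparing a workbook, because readxl ships example files. The sketch below assumes the readxl package is installed and uses its bundled datasets.xlsx, which contains a sheet named iris.

```r
library(readxl)

# Path to an example workbook bundled with readxl (not a file of your own).
path <- readxl_example("datasets.xlsx")

# List the sheet names available in the workbook.
excel_sheets(path)

# Read only cells A1:D5 of the "iris" sheet: one header row plus four data rows.
df <- read_excel(path, sheet = "iris", range = "A1:D5")

df
```

When range is supplied it takes precedence over skip, so the two are normally not combined.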

readHTMLTable()
  Example:     url <- "http://www.webaddress.com"
               urldata <- GET(url)
               df <- readHTMLTable(rawToChar(urldata$content))
  Options:     url <- "http://www.webaddress.com" stores the website address.
               urldata <- GET(url) captures the data from the website.
               which = n selects a specific table.
  Description: Provides methods for extracting data from HTML tables in an HTML
               document. The XML and httr packages are needed.
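
To try readHTMLTable() without a network connection, the sketch below parses an inline HTML string instead of a page fetched with GET(). It assumes the XML package is installed; the table contents are invented, and htmlParse(asText = TRUE) is used here as a stand-in for the download step.

```r
library(XML)

# Hypothetical HTML document containing a single table.
html <- "<html><body><table>
<tr><th>name</th><th>score</th></tr>
<tr><td>Ana</td><td>90</td></tr>
<tr><td>Ben</td><td>85</td></tr>
</table></body></html>"

# Parse the string as HTML, then extract the first table as a data frame.
doc <- htmlParse(html, asText = TRUE)
df <- readHTMLTable(doc, which = 1, header = TRUE)

df
```

With a real page, the string passed to htmlParse() would be rawToChar(urldata$content) from the httr response.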

pdf_text()
  Example:     vector <- pdf_text("url")
  Options:     url is the web address. It is also possible to use the name of a
               file (e.g., "file.pdf") to import the data.
  Description: Provides methods for extracting data from PDF files. It imports the
               raw text into a character vector, with spaces to show the white
               space and \n to show line breaks. It returns a vector in which each
               element contains one page of the PDF file.
               The pdftools package is needed.
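
The one-element-per-page behavior can be checked on a small PDF generated locally. This sketch assumes the pdftools package is installed; the PDF is produced with the base R pdf() graphics device purely so that there is something to read.

```r
library(pdftools)

# Create a hypothetical one-page PDF with a line of text.
path <- tempfile(fileext = ".pdf")
pdf(path)
plot.new()
text(0.5, 0.5, "Hello from R")
invisible(dev.off())

# pdf_text() returns a character vector with one element per page.
pages <- pdf_text(path)

length(pages)
cat(pages[1])
```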

df = a data frame.
©2020, Dr. Raul Fernandez-Rojas, University of Canberra
Common delimiters (sep =): white space (""), commas (","), tabs ("\t")
Note: Most functions that create a data frame coerce characters into factors by default in R versions before 4.0.0 (from 4.0.0 on, the default is stringsAsFactors = FALSE). Pass stringsAsFactors = FALSE to avoid that behavior.
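
The difference made by stringsAsFactors can be seen by reading the same data twice, as in this sketch. The CSV content is made up, and it is passed through text = so that no file is needed.

```r
# Hypothetical CSV data supplied as a string.
csv <- "name,group
Ana,a
Ben,b"

# Same data, read with and without coercion of characters to factors.
as_factor <- read.csv(text = csv, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv, stringsAsFactors = FALSE)

class(as_factor$group)
class(as_char$group)
```

Setting the argument explicitly keeps the result the same regardless of the R version in use.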
