easydb project:
easy database interface for SQLite, MS Access, MS Excel and MySQL
[[ Jump to the detailed description ]]
easydb (and its sub--packages) allows you
to read and write data from / to SQLite, MS Access, MS Excel and MySQL databases, very
simply (see below).
easydb works on top of RSQLite and RODBC. easydb subsetting style
is similar to the dfdb package (that brought the idea).
Step 1: describe the database
library( "easyrsqlite" ) # if SQLite
library( "easyrodbcaccess" ) # if MS Access
library( "easyrodbcexcel" ) # If MS Excel
library( "easyrodbcmysql" ) # if MySQL
# For a MS Access database
myDb <- edb( dbName = "soils.mdb", dbType = "RODBC_Access" )
# For a SQLite database
myDb <- edb( dbName = "soils.db", dbType = "RSQLite_SQLite" )
Notes: This step is the only operation that vary with database type.
myDb is not a connexion. It doesn't need to be "closed" when operations
are finished.
Step 2: inspect the database
# List existing tables
res <- edbNames( myDb )
# List columns in a table
res <- edbColnames( myDb, tableName = "SOIL_PROFILES" )
Step 3: Retrieve data:
# data.frame style subseting, with single square brackets [ ]
# Get the table "SOIL_PROFILES"
res <- myDb[ "SOIL_PROFILES" ]
# Or retreive only some columns "ID_PROFILE" and "NAME"
res <- myDb[ "SOIL_PROFILES", , c("ID_PROFILE","NAME") ]
# Or retreive only some rows and some columns (id 1 to 5)
res <- myDb[ "SOIL_PROFILES", list( "ID_PROFILE" = 1:5 ), c("ID_PROFILE","NAME") ]
Notes: No need of SQL queries, easydb does it internally (for basic
operations). More row constrains can be passed in list( ... )
Step 4: Write data:
# Add data to a table
myDb[ "SOIL_PROFILES" ] <- some_data_frame
easydb: detailed package
description
About easydb
The easydb provides the abstraction layer (object classes) that
are used by the database specific sub-packages easyrsqlite, easyrodbcaccess,
easyrodbcexcel, and easyrodbcmysql.
easydb is an R package providing functions to easily
read and write data from / to SQLite, MS Access, MS Excel and MySQL
databases, and perform a few other operations. easydb provide the same interface (set of functions) for all these databases. It is build on
top of RSQLite and RODBC, but tries to hide
tedious operations
such as opening and closing database connections, or writing SQL
queries. easydb provides S3 classes functions to manipulate databases in a similar way as a data.frame
(single square brackets [ ] subsetting). Nonetheless it does not work
exactly as a data.frame, because a data.frame have rows and
columns, while a databases have tables, rows and columns. I
lately realized that the idea has actually been already been
implemented since 2009 by John Fox and Oswaldo Cruz in their dfdb package
(very similar to easydb for some things). I hope you will find the
package useful and not too buggy (please report me any problem).
Credit and license
Project page on rforge
Feature list
- Common database interface for SQLite, MS Access, MS Excel and MySQL
(same functions, same arguments);
- Data fetching, subsetting and writing similar to data.frame operation
(single square brackets "myDb[,,]")
- No connections to open or close (done internally);
- In case of operation failure, the database connection is closed
internally (no open connections in the wild);
- Table subsetting is SQL free (select only some rows or columns); SQL
is internal (edbRead() or "db[,,]");
- Can create tables, append, overwrite or update data in a table with
the same function (edbWrite()
or "db[,,] <- value" );
- Can
retrieve the Primary Key ID automatically attributed (edbWrite()),
which is usefull for database referential integrity;
- Fetch table dimensions (edbDim()),
number of rows or columns (edbNRow()
and edbNCol());
- Possibility to pass SQL constrains for rows subsetting;
- Implements SQL ORDER BY and SQL DISTINCT internally (equivalent to R order() and unique());
- Columns substetting with column names, integer indexes or vectors of
logicals;
- S3 class methods are used to define database operations for each
database type;
What easydb doesn't do
- Completely replace RSQLite or RODBC. For more complex operations, you
will need to know / use these packages;
- More complex SQL operations like joins, ...;
- Renaming tables and columns;
- Double square brackets "[[]]" or dollar "$" table subsetting;
- and a lot more;
Note: this is not a roadmap. I
may implement some of these features in the future (especially those I
need), but I don't intend to create the ultimate / universal R Swiss
Army Knife for handling databases :o). Any code contribution is more
than welcome. Report me any bug you may find.
Similar R packages
- dfdb
(easydb wouldn't be there if I hadn't re-invented the wheel);
- sqlitedf
(store data.frame's in SQLite databases)
- bigmemory
(massive datasets);
- ff;
- sqldf
(easydb's reverse mirror: subset data.frame's with SQL!);
- data.table
(fast data.frame's)
- dbframe
(SQL free SQLite database subsetting. New)
Last updated: 2011-12-15