Monday, July 17, 2017

Language: R (for database query)......


MongoDB is a document database that stores data as JSON-like documents with optional schemas.
Download, install, and start MongoDB:
mongod

#mongolite is an R client package for connecting to a running MongoDB server
library(ggplot2)
library(dplyr)
library(maps)
library(ggmap)
library(mongolite)
library(lubridate)
library(gridExtra)

#Obtain data from different sources (for example from https://catalog.data.gov/dataset/crimes); here it is read from a CSV file
crimes=data.table::fread("Crimes_2001_to_present.csv")
names(crimes)

# remove spaces in the column names for convenience of query
names(crimes) = gsub(" ","",names(crimes)) 
names(crimes)
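
As a quick illustration of the renaming step above, the sketch below applies the same gsub() call to a few made-up column names (they are examples, not read from the file). It also checks that the cleaned names are still unique, since removing spaces could in principle make two names collide.

```r
# Example column names; assumed for illustration, not taken from the CSV
cols <- c("Case Number", "Primary Type", "Location Description", "Domestic")

# Same transformation as above: drop every space
clean <- gsub(" ", "", cols)
print(clean)

# Guard against two columns collapsing to the same name
stopifnot(!anyDuplicated(clean))
```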

#connect to a database called Chicago and a collection called crimes (both are created on the first insert), load the data, then check the record count
my_collection = mongo(collection = "crimes", db = "Chicago") 
my_collection$insert(crimes)
my_collection$count()

#show one record
my_collection$iterate()$one()

#number of distinct values of "PrimaryType"
length(my_collection$distinct("PrimaryType"))
#number of domestic assaults in the collection
my_collection$count('{"PrimaryType" : "ASSAULT", "Domestic" : "true" }')
# 82470
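
MongoDB matches values by type, so the filter above only finds records where Domestic is stored as the text "true". If the column was imported as a logical instead, the stored value would be a Boolean and the filter would need an unquoted true. The sketch below uses jsonlite (a dependency of mongolite) to build both variants from an R list; build_query is a hypothetical helper name, not part of mongolite.

```r
library(jsonlite)

# Hypothetical helper: turn a named R list into a JSON query string
build_query <- function(filter) {
  as.character(toJSON(filter, auto_unbox = TRUE))
}

q_text <- build_query(list(PrimaryType = "ASSAULT", Domestic = "true"))
q_bool <- build_query(list(PrimaryType = "ASSAULT", Domestic = TRUE))

print(q_text)  # Domestic compared as a string
print(q_bool)  # Domestic compared as a Boolean

# Either string can be passed to my_collection$count() or my_collection$find()
```

If the count comes back as 0 with one form, trying the other is a cheap way to find out how the field was stored.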


#get only the columns of interest
query1= my_collection$find('{"PrimaryType" : "ASSAULT", "Domestic" : "true" }')
query2= my_collection$find('{"PrimaryType" : "ASSAULT", "Domestic" : "true" }',
                           fields = '{"_id":0, "PrimaryType":1, "Domestic":1}')
ncol(query1) # with all the columns
ncol(query2) # only the selected columns
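
When exploring a large collection it can help to pull back only a few documents first. The sketch below wraps the projected query in a small function and adds the limit argument of find(); preview_assaults is a hypothetical name, and con is assumed to be a mongolite connection such as my_collection.

```r
# Hypothetical helper: return the first n matching documents,
# keeping only the PrimaryType and Domestic fields
preview_assaults <- function(con, n = 5) {
  con$find(
    query  = '{"PrimaryType" : "ASSAULT", "Domestic" : "true" }',
    fields = '{"_id":0, "PrimaryType":1, "Domestic":1}',
    limit  = n
  )
}

# Usage, assuming the connection created earlier and a running mongod:
# head_rows <- preview_assaults(my_collection, n = 5)
# nrow(head_rows)
```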
