The error is this:
Error: BigR[.bigr.executeJaqlQuery]: Error: BigR[.bigr.jdbc.query.helper]: Error code : -1, SQLState : 02001 Caused by : originating expression ends at (line: 1, column: 6636): java.lang.RuntimeException: ending quote missing in field starting at position 193 originating expression ends at (line: 1, column: 6636)
My code is this:
options(java.home="C:\\Program Files\\Java\\jre1.8.0_31\\")
library(bigr)
bigr.connect(host="198.11.249.112", port=7052, user="biblumix", password="mypassword")
is.bigr.connected()
bigr.listfs("/user/biblumix")
crime <- bigr.frame(dataSource="DEL", dataPath="/user/biblumix/Crimes_-_2001_to_present.csv", delimiter=",", header=T, useMapReduce=F)
crime <- na.omit(crime)
bf <- crime
bf$Arrested <- ifelse(bf$"Arrest" == "true", 1, 0)
bf$Homicide <- ifelse(bf$"Primary Type" == c("HOMICIDE"), 1, 0)
bf$Narcotics <- ifelse(bf$"Primary Type" == c("NARCOTICS"), 1, 0)
bf$ID_n<-1
bf$Arrested_n<-as.numeric(bf$Arrested)
bf$Homicide_n<-as.numeric(bf$Homicide)
bf$Narcotics_n<-as.numeric(bf$Narcotics)
bf$Year_n<-as.numeric(bf$Year)
bf$Latitude_n<-as.numeric(bf$Latitude)
bf$Longitude_n<-as.numeric(bf$Longitude)
bf$"Case Number"<-NULL
bf$"Date"<-NULL
bf$"Block"<-NULL
bf$"IUCR"<-NULL
bf$"Primary Type"<-NULL
bf$"Description"<-NULL
bf$"Location Description"<-NULL
bf$"Arrest"<-NULL
bf$"Domestic"<-NULL
bf$"Beat"<-NULL
bf$"District"<-NULL
bf$"Ward"<-NULL
bf$"Community Area"<-NULL
bf$"FBI Code"<-NULL
bf$"X Coordinate"<-NULL
bf$"Y Coordinate"<-NULL
bf$"Updated On"<-NULL
bf$"Location"<-NULL
bf$ID<-NULL
bf$Arrested<-NULL
bf$Homicide<-NULL
bf$Narcotics<-NULL
bf$Year<-NULL
bf$Latitude<-NULL
bf$Longitude<-NULL
print("pb0")
attach(bf)
bf_subset <- bf[Year_n == 2014 & (Homicide_n == 1), c("ID_n", "Arrested_n", "Homicide_n", "Narcotics_n", "Year_n", "Latitude_n", "Longitude_n")]
detach(bf)
print("pb1")
bf_subset_samples <- bigr.sample(bf_subset, c(0.5, 0.5))
print("pb2")
Sample of the processed dataset is this:
head(bf_subset)
  ID_n Arrested_n Homicide_n Narcotics_n Year_n Latitude_n Longitude_n
1    1          0          1           0   2014   41.75216   -87.60171
2    1          0          1           0   2014   41.84124   -87.70630
3    1          0          1           0   2014   41.84124   -87.70630
4    1          0          1           0   2014   41.79568   -87.77230
5    1          1          1           0   2014   41.74601   -87.55051
6    1          1          1           0   2014   41.74601   -87.55051
Many thanks for any help you can give.
Answer by OscarD.LaraYejas (1) | Apr 29, 2015 at 04:00 PM
Hello.
Could you please share the dataset you're using so I can try to reproduce the problem on my end? It looks like there may be some inconsistency in the data, e.g., a missing delimiter or quote character, in case the data are quoted.
Could you also run the two commands below and paste the output?
summary(bf)
summary(bf_subset)
Thanks, Oscar
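One way to test the missing-quote hypothesis on a local copy of the CSV is to look for lines that contain an odd number of double-quote characters. The following is a minimal base R sketch and is not part of bigr; the helper name `find_unbalanced_quotes` and the sample rows are made up for illustration, and it assumes that quoted fields never span more than one line.

```r
# Hypothetical helper (not a bigr function): return the line numbers
# whose count of double-quote characters is odd.
find_unbalanced_quotes <- function(path, n = -1L) {
  # n = -1L reads the whole file; pass a positive n to check only the first n lines
  lines <- readLines(path, n = n, warn = FALSE)
  # regmatches() yields character(0) for lines with no quote, so lengths() gives 0
  quote_counts <- lengths(regmatches(lines, gregexpr('"', lines, fixed = TRUE)))
  which(quote_counts %% 2L == 1L)
}

# Made-up sample rows standing in for a local copy of the downloaded CSV.
sample_path <- tempfile(fileext = ".csv")
writeLines(c(
  'ID,Block,Location',
  '1,"010XX N MAIN ST","(41.75, -87.60)"',
  '2,"020XX S ELM ST,"(41.84, -87.70)"',   # closing quote missing after ST
  '3,030XX W OAK ST,'
), sample_path)

bad_lines <- find_unbalanced_quotes(sample_path)
print(bad_lines)
```

For a file of more than a gigabyte it may be more practical to pass a positive `n` and inspect the file in pieces, since `readLines` loads everything it reads into memory.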
Answer by PDV (56) | May 04, 2015 at 12:10 PM
Hello Oscar, the dataset was downloaded from: https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2
I hope this helps. Regards Paola
Answer by OscarD.LaraYejas (1) | May 04, 2015 at 03:57 PM
Hello.
I ran the same code snippet using the dataset from Paola's link (it's about 1.3 GB) in a Bluemix environment, and everything ran fine for me. In the end, I was able to get the result of the sample (see below).
> bf_subset_samples <- bigr.sample(bf_subset, c(0.5, 0.5))
> head(bf_subset_samples[[1]])
ID_n Arrested_n Homicide_n Narcotics_n Year_n Latitude_n Longitude_n
1 21709 0 1 0 2014 41.89529 -87.75821
2 21708 0 1 0 2014 41.98571 -87.69210
3 21705 0 1 0 2014 41.65428 -87.60604
4 21706 1 1 0 2014 41.80925 -87.61794
5 21702 0 1 0 2014 41.90113 -87.75345
6 21700 1 1 0 2014 41.75173 -87.64026
The only thing I can think of is that perhaps @brozzie is using a different version of the dataset. Just to double-check, here is what I did. I went to this link:
https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2
And then I selected "Export" and "CSV". The file is 1.3 GB.
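To compare two local copies of the export, it may help to look at the file size, a checksum, and the header columns. The sketch below uses only base R and the `tools` package that ships with R; `describe_csv` is a hypothetical helper, the sample file is made up, and the header split assumes that no column name contains a comma.

```r
# Hypothetical helper: summarize a local CSV export so that two copies
# can be compared without loading the whole file.
describe_csv <- function(path) {
  header <- readLines(path, n = 1L, warn = FALSE)
  list(
    size_bytes = file.info(path)$size,
    size_gb    = file.info(path)$size / 1024^3,
    md5        = unname(tools::md5sum(path)),
    # Assumes a plain comma-separated header without quoted commas
    columns    = strsplit(header, ",", fixed = TRUE)[[1]]
  )
}

# Made-up sample file standing in for the downloaded export.
sample_path <- tempfile(fileext = ".csv")
writeLines(c("ID,Case Number,Year", "1,HX000001,2014"), sample_path)

info <- describe_csv(sample_path)
print(info$columns)
print(info$size_bytes)
```

If the size or checksum differs between two downloads, the copies are not identical, which would be consistent with the different-version explanation.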
If it still doesn't work, could you paste the results of these three commands?
summary(crime)
summary(bf)
summary(bf_subset)
Thanks, Oscar