
I keep getting an SQLState error when running Big R functions on my all-numeric dataset, e.g. the sample() function does not work

Question by brozzie (1) | Apr 26, 2015 at 07:05 AM | Tags: hadoop, ibmcloud, analytics for hadoop, bigr

The error is this:

Error: BigR[.bigr.executeJaqlQuery]: Error: BigR[.bigr.jdbc.query.helper]: Error code : -1, SQLState : 02001 Caused by : originating expression ends at (line: 1, column: 6636): java.lang.RuntimeException: ending quote missing in field starting at position 193 originating expression ends at (line: 1, column: 6636)

My code is this:

options(java.home="C:\\Program Files\\Java\\jre1.8.0_31\\")
library(bigr)
bigr.connect(host="198.11.249.112", port=7052, user="biblumix", password="mypassword")
is.bigr.connected()
bigr.listfs("/user/biblumix")
crime <- bigr.frame(dataSource="DEL", dataPath="/user/biblumix/Crimes_-_2001_to_present.csv", delimiter=",", header=T, useMapReduce=F)

crime <- na.omit(crime)
bf <- crime
bf$Arrested <- ifelse(bf$"Arrest" == "true", 1, 0)
bf$Homicide <- ifelse(bf$"Primary Type" == c("HOMICIDE"), 1, 0)
bf$Narcotics <- ifelse(bf$"Primary Type" == c("NARCOTICS"), 1, 0)

bf$ID_n<-as.numeric(bf$ID)

bf$ID_n <- 1
bf$Arrested_n <- as.numeric(bf$Arrested)
bf$Homicide_n <- as.numeric(bf$Homicide)
bf$Narcotics_n <- as.numeric(bf$Narcotics)
bf$Year_n <- as.numeric(bf$Year)
bf$Latitude_n <- as.numeric(bf$Latitude)
bf$Longitude_n <- as.numeric(bf$Longitude)

bf$"Case Number"<-NULL
bf$"Date"<-NULL
bf$"Block"<-NULL
bf$"IUCR"<-NULL
bf$"Primary Type"<-NULL
bf$"Description"<-NULL
bf$"Location Description"<-NULL
bf$"Arrest"<-NULL
bf$"Domestic"<-NULL
bf$"Beat"<-NULL
bf$"District"<-NULL
bf$"Ward"<-NULL
bf$"Community Area"<-NULL
bf$"FBI Code"<-NULL
bf$"X Coordinate"<-NULL
bf$"Y Coordinate"<-NULL
bf$"Updated On"<-NULL
bf$"Location"<-NULL

bf$ID<-NULL
bf$Arrested<-NULL
bf$Homicide<-NULL
bf$Narcotics<-NULL
bf$Year<-NULL
bf$Latitude<-NULL
bf$Longitude<-NULL

print("pb0")
attach(bf)
bf_subset <- bf[Year_n == 2014 & (Homicide_n == 1), c("ID_n", "Arrested_n", "Homicide_n", "Narcotics_n", "Year_n", "Latitude_n", "Longitude_n")]
detach(bf)
print("pb1")

bf_subset_samples <- bigr.sample(bf_subset, c(0.5, 0.5))
print("pb2")

A sample of the processed dataset:

head(bf_subset)
  ID_n Arrested_n Homicide_n Narcotics_n Year_n Latitude_n Longitude_n
1    1          0          1           0   2014   41.75216   -87.60171
2    1          0          1           0   2014   41.84124   -87.70630
3    1          0          1           0   2014   41.84124   -87.70630
4    1          0          1           0   2014   41.79568   -87.77230
5    1          1          1           0   2014   41.74601   -87.55051
6    1          1          1           0   2014   41.74601   -87.55051

Many thanks for any help you can give.

3 answers


Answer by OscarD.LaraYejas (1) | Apr 29, 2015 at 04:00 PM

Hello.

Could you please share the dataset you're using, so I can try to reproduce the error on my end? It looks like there may be an inconsistency in the data, e.g. a missing delimiter or quote character, if the data are quoted.

Could you also run the two commands below and paste the output?

summary(bf)

summary(bf_subset)

Thanks, Oscar
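
One way to look for the kind of inconsistency described above is to count the double-quote characters on each line of the CSV and flag lines where the count is odd. The snippet below is a minimal sketch in base R, not part of Big R; the function name `find_unbalanced_quotes` and the sample lines are hypothetical and only illustrate the idea.

```r
# Return the indices of lines that contain an odd number of double quotes.
# An odd count suggests a quoted field that was never closed on that line.
find_unbalanced_quotes <- function(lines) {
  n_quotes <- lengths(regmatches(lines, gregexpr('"', lines, fixed = TRUE)))
  which(n_quotes %% 2 == 1)
}

# Hypothetical sample data: the third line is missing a closing quote.
sample_lines <- c(
  'ID,Description,Arrest',
  '1,"SIMPLE",true',
  '2,"AGGRAVATED: HANDGUN,false',
  '3,"TO VEHICLE",false'
)

bad <- find_unbalanced_quotes(sample_lines)
print(bad)
```

For a real file, the lines could come from `readLines()` on a local copy of the CSV, read in chunks if the file is large. An odd count is only a hint: a quoted field that legitimately spans several lines would also be flagged.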


Answer by PDV (56) | May 04, 2015 at 12:10 PM

Hello Oscar, the dataset was downloaded from: https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2

I hope this helps. Regards Paola


Answer by OscarD.LaraYejas (1) | May 04, 2015 at 03:57 PM

Hello.

I ran the same code snippet against the dataset from Paola's link (about 1.3 GB) in a Bluemix environment, and it ran without errors. I was able to get the result of the sample (see below).

 > bf_subset_samples <- bigr.sample(bf_subset, c(0.5, 0.5))
 > head(bf_subset_samples[[1]])
    ID_n Arrested_n Homicide_n Narcotics_n Year_n Latitude_n Longitude_n    
 1 21709          0          1           0   2014   41.89529   -87.75821    
 2 21708          0          1           0   2014   41.98571   -87.69210    
 3 21705          0          1           0   2014   41.65428   -87.60604    
 4 21706          1          1           0   2014   41.80925   -87.61794    
 5 21702          0          1           0   2014   41.90113   -87.75345    
 6 21700          1          1           0   2014   41.75173   -87.64026

The only explanation I can think of is that @brozzie is using a different version of the dataset. To double-check, here is what I did. I went to this link:

https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2

And then I selected "Export" and "CSV". The file is 1.3 GB.

If it still doesn't work, could you paste the results of these three commands?

 summary(crime)    
 summary(bf)    
 summary(bf_subset)

Thanks, Oscar
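
To compare two copies of the dataset, it may help to look at the file size and a checksum of each local copy before uploading. The following is a rough sketch in base R; the helper name `describe_file` is hypothetical, and the temporary file only stands in for the downloaded CSV.

```r
# Report the size in bytes and the MD5 checksum of a local file.
describe_file <- function(path) {
  list(
    size_bytes = file.info(path)$size,
    md5 = unname(tools::md5sum(path))
  )
}

# Stand-in file for illustration; replace with the path to the downloaded CSV.
tmp <- tempfile(fileext = ".csv")
writeLines(c("ID,Arrest", "1,true"), tmp)

info <- describe_file(tmp)
print(info)
```

If the size or checksum differs between two downloads, the copies are not identical. Since the source dataset is updated over time, a difference alone does not show that either copy is malformed.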
