
File Source Operator - hasHeaderLine parameter only works for the first file received when used with Directory Scan

Question by DaveFitz (1) | Jan 13, 2017 at 03:25 AM | Tags: streamsdev, files, filesource, directories

Hi All,

I am trying to monitor a specific directory and process the files placed there. As part of the initial processing, I want to skip the one-line header in each file. DirectoryScan feeding a FileSource with the hasHeaderLine parameter seemed the perfect solution. In unit testing with a single file it worked fine, but now, in live testing with multiple files, I can see that hasHeaderLine is applied only to the first file. For subsequent files, the first line is passed through.

Sample code below (I've tried 1u, true, and 2u as values for hasHeaderLine):

             stream<rstring filename> InputFiles = DirectoryScan()
             {
                     param
                             directory : "/opt/var/source" ;
             }


             stream<InputRecordStructure> InputRecords = FileSource(InputFiles)
             {
                     param
                             format : csv ;
                             separator : ",";
                             parsing : fast;
                             ignoreExtraCSVValues : true;
                             hasHeaderLine : 1u;
                             moveFileToDirectory : "/opt/var/processed";
             }

3 answers

Answer by Bruce Glassford (912) | Jan 13, 2017 at 09:13 AM

Which version are you running? I just did some tests on 4.2.0.2 on my RH6 system here (since this could be a problem for some of our analytics), and it worked fine.

Answer by Ruleman (127) | Jan 13, 2017 at 04:47 PM

This is indeed a known defect in release 4.1.1. It looks like it was fixed in 4.1.1 fix pack 2. That was September 2016, right around the same time as the release of 4.2.0. I don't know for sure if the fix is in 4.2.0.0, but by Bruce's observation it's definitely in 4.2.0.2.
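
Until the fix pack can be applied, one possible workaround is to stop relying on hasHeaderLine and drop the first line of each file yourself. The sketch below is untested and rests on two assumptions: that FileSource emits a window punctuation at the end of each file (worth confirming in the operator documentation for your release), and that reading with format : line is acceptable for your data. The composite name SkipHeaderPerFile and the stream names RawLines and DataLines are made up for illustration; the directories are taken from the question.

```spl
composite SkipHeaderPerFile
{
    graph
        // Emit one tuple per file that appears in the watched directory.
        stream<rstring filename> InputFiles = DirectoryScan()
        {
            param
                directory : "/opt/var/source" ;
        }

        // Read each file as raw lines; hasHeaderLine is deliberately not used.
        stream<rstring line> RawLines = FileSource(InputFiles)
        {
            param
                format : line ;
                moveFileToDirectory : "/opt/var/processed" ;
        }

        // Drop the first line seen after each end-of-file punctuation.
        // Assumption: FileSource sends a window punctuation after every file.
        stream<rstring line> DataLines = Custom(RawLines)
        {
            logic
                state : mutable boolean expectHeader = true ;
                onTuple RawLines :
                {
                    if(expectHeader)
                    {
                        // This is the header line of the current file: skip it.
                        expectHeader = false ;
                    }
                    else
                    {
                        submit(RawLines, DataLines) ;
                    }
                }
                onPunct RawLines :
                {
                    if(currentPunct() == Sys.WindowMarker)
                    {
                        // End of the current file: the next line is a header again.
                        expectHeader = true ;
                    }
                    // Forward punctuation so downstream operators still see file boundaries.
                    submit(currentPunct(), DataLines) ;
                }
        }
}
```

Note that DataLines carries unparsed text, so the lines would still need to be split into the attributes of InputRecordStructure in a later step, which this sketch leaves out.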

Answer by DaveFitz (1) | Jan 16, 2017 at 02:51 AM

Hi Bruce and Ruleman,

Thanks for the responses. I'm using 4.1.1. I'll look into applying fix pack 2 and see if that resolves it.
