Table of Contents
- Where can I find more information about the secure gateway that I need to connect securely to my source database?
- What are the environment details about where I can install my secure gateway?
- How many instances of the Secure Gateway are required per IBM Cloud organization?
- Can I migrate data securely from on-premises data sources?
- What security compliance standards does Lift support?
- Can I keep using my source database while the migration is in progress?
- Can I migrate large databases?
- What artifacts of my database can I migrate?
- What data sources and targets can I use when I migrate data?
- How do I use the grooming process properly when I migrate data from IBM PureData System for Analytics?
- What network ports must be open on my on-premises system?
- What is the architecture of Lift?
- What operations do I need to do outside of my web browser when I use Lift?
- What are the best practices for attaching and configuring additional storage for PureData System for Analytics?
- What is the Slingshot?
- Do I need to set up the Slingshot for each connection or can I use one installation of the Slingshot for multiple connections?
- What are the specifications for the environment that I can install the Slingshot on?
- What part of my on-premises system should I install the Slingshot on if I’m using IBM PureData System for Analytics sources?
- Where should I install the Slingshot for the best throughput if I’m using IBM PureData System for Analytics as my source?
- What command line interface (CLI) options can I use for Slingshot?
- What is an activity?
- During a Lift maintenance window, what happens to my Lift activity that is running at the time?
- Parsing your .csv source
Where can I find more information about the secure gateway that I need to connect securely to my source database?
You can learn more about secure gateways and the Secure Gateway service by visiting About Secure Gateway. You can configure your secure gateway directly from the Lift console, so you don’t have to configure a separate Secure Gateway service in IBM Cloud.
What are the environment details about where I can install my secure gateway?
See Setting up a client for information on hardware and software specifications for the secure gateway client.
How many instances of the Secure Gateway are required per IBM Cloud organization?
Only one instance of the Secure Gateway service is required per IBM Cloud organization. The instance includes 250 concurrent connections per client.
Can I migrate data securely from on-premises data sources?
Yes. By using IBM’s Secure Gateway service along with IBM’s Aspera secure data transfer technology, you can migrate data securely when you work with on-premises data sources.
What security compliance standards does Lift support?
Lift complies with SOC 2 Type 1 security standards. For more information on security compliance, see About Lift on IBM Cloud.
Can I keep using my source database while the migration is in progress?
Yes. You can enable continuous replication, which picks up the changes that are made in the source database while the data is being moved and applies them to the target. Doing so helps to eliminate downtime on your source database. Continuous replication is available only when you’re migrating data from PureData System for Analytics to Db2 Warehouse on Cloud (formerly dashDB).
Can I migrate large databases?
Yes. You can migrate a database of any size. Keep in mind that the duration of your database migration depends on your network connection speed, the amount of uncompressed data that you need to move, and the hardware profiles of your source and target computers.
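As a rough illustration of how network speed and data volume drive migration time, the following Python sketch computes a lower bound from those two factors alone. It is not part of Lift: the function name and the sample figures are hypothetical, and the estimate ignores hardware profiles, compression, and protocol overhead, so real migrations can take longer.

```python
def estimate_transfer_hours(uncompressed_gb: float, link_mbps: float) -> float:
    """Lower-bound transfer time in hours for a data size and a link speed.

    Assumes decimal units (1 GB = 10**9 bytes, 1 Mbps = 10**6 bits per second)
    and a link that is fully available to the migration.
    """
    if uncompressed_gb < 0 or link_mbps <= 0:
        raise ValueError("size must be non-negative and link speed positive")
    bits = uncompressed_gb * 1e9 * 8        # gigabytes to bits
    seconds = bits / (link_mbps * 1e6)      # divide by bits per second
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical example: 5 TB of uncompressed data over a 1 Gbps link.
    print(f"{estimate_transfer_hours(5000, 1000):.1f} hours at best")
```

Treat the result as a floor for planning purposes rather than a prediction.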
What artifacts of my database can I migrate?
You can migrate your database schema and associated data tables. If you need help migrating other artifacts, such as stored procedures or metadata, contact our IBM Analytics Services team.
What data sources and targets can I use when I migrate data?
Currently, you can migrate data from IBM PureData System for Analytics and CSV files to IBM Db2 Warehouse on Cloud, and from on-premises IBM Db2 and CSV files to IBM Db2 on Cloud.
How do I use the grooming process properly when I migrate data from IBM PureData System for Analytics?
Lift detects incremental changes in a source PureData System for Analytics database by looking at historical row versions. Historical row versions in a PureData System for Analytics database must be “groomed” (removed) to allow the PureData System for Analytics database to function efficiently. However, if historical row versions are removed before Lift processes the row versions, Lift does not properly detect the changes that occur in the source PureData System for Analytics database.
To ensure that row versions are not removed before Lift processes them, Lift provides a mechanism to remove row versions only after they are processed by Lift.
Lift provides a script (bin/liftreclaim.sh) that must be used to remove row versions. This script is part of the Slingshot installation and must be moved to the PureData System for Analytics source. All grooming of historical row versions must be done by using this script. If removal is done by using a different method, Lift does not properly detect changes in the source PureData System for Analytics database.
What network ports must be open on my on-premises system?
| dashDB SSL-secured JDBC | TCP | OUTBOUND | INTERNET | 50001 |
| dashDB REST Load API | TCP | OUTBOUND | INTERNET | 8443 |
* The computer that the Slingshot runs on must be able to connect to the source by using the JDBC API.
** The port number depends on the source connection configuration.
1: After the Slingshot initiates the OUTBOUND connection toward the dashDB cluster on port 33001, return traffic comes back over that same connection. The operating system randomly chooses the local port from the ephemeral port range. Modern firewalls are stateful (connection-aware or state-aware), so you are not expected to need to open any INBOUND port.
For Aspera transfer firewall considerations, see https://support.asperasoft.com/hc/en-us/articles/216127518-Firewall-Considerations-.
2: Make sure any firewalls allow port 443 to connect to the URLs https://dataworks-lift-accelerator-gen4-yp.mybluemix.net/ and https://dataworks-lift-accelerator-gen4-lyp.eu-gb.mybluemix.net/.
3: Make sure any firewalls allow port 80 to connect to the URL http://s1.symcb.com/.
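To verify the firewall rules before you start a migration, you could run a small outbound connectivity check from the computer that hosts the Slingshot. The Python sketch below is only an illustration and is not shipped with Lift. The helper names are hypothetical, and the dashDB host name is a placeholder that you would replace with your own target.

```python
import socket
from typing import Dict, List, Tuple


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_targets(targets: List[Tuple[str, int]]) -> Dict[str, bool]:
    """Try each (host, port) pair and map "host:port" to the result."""
    return {f"{host}:{port}": can_connect(host, port) for host, port in targets}


# Ports and URLs from the notes above; the dashDB host is a placeholder.
EXAMPLE_TARGETS = [
    ("your-dashdb-host.example.com", 50001),  # dashDB SSL-secured JDBC
    ("your-dashdb-host.example.com", 8443),   # dashDB REST Load API
    ("dataworks-lift-accelerator-gen4-yp.mybluemix.net", 443),
    ("s1.symcb.com", 80),
]
```

Calling `check_targets(EXAMPLE_TARGETS)` returns a dictionary of results. A `False` entry means only that the TCP connection could not be established from this computer, so it can indicate a firewall rule, a DNS problem, or an unavailable service.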
What operations do I need to do outside of my web browser when I use Lift?
- Installing your secure gateway, which includes:
  - Adding the source database IP/hostname and port to the ACL list (except when the source is CSV format).
  - Adding the Slingshot IP/hostname and port to the ACL list.
- Installing the Slingshot.
- Generating the DDL file to create the tables on the IBM dashDB database.
What are the best practices for attaching and configuring additional storage for PureData System for Analytics?
The following technotes provide best practices for attaching and configuring additional storage for PureData System for Analytics:
- Adding SAN Storage to PureData Systems for Analytics
- IBM PureData System for Analytics: Mounting NFS on the appliance
- Mounting NFS filesystem on PureData for Analytics systems
What is the Slingshot?
The Lift Slingshot is a small app that runs close to your source database and supercharges your data transfer to the cloud. You must set up the Slingshot as part of your source database connection before you can begin a migration.
Do I need to set up the Slingshot for each connection or can I use one installation of the Slingshot for multiple connections?
At the moment, the Slingshot is best configured per connection. That means a separate installation of the Slingshot for each on-premises connection, with a different port for each one. However, you can use the same installation of the Slingshot for different connections with the following limitations:
- Only one connection can run at a time.
- Before starting the connection, you must provide the proper secure token for that connection.
What are the specifications for the environment that I can install the Slingshot on?
- Red Hat Enterprise Linux 7.2 operating system
- 8 CPU cores
- 16 GB RAM
- 500 GB – 1 TB of disk space
- Docker software installed if the Docker version of Secure Gateway is to be used
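One way to check a candidate Linux host against the numeric minimums above is a short script like the following Python sketch. It is an illustration rather than an IBM-provided tool. The function names are hypothetical, the disk check assumes that free space matters and uses the low end (500 GB) of the stated range, and the operating system and Docker items are not checked.

```python
import os
import shutil
from typing import List, Tuple

# Minimums from the list above; 500 GB is the low end of the disk range.
MIN_CPU_CORES = 8
MIN_RAM_GB = 16
MIN_DISK_GB = 500


def shortfalls(cpu_cores: int, ram_gb: float, free_disk_gb: float) -> List[str]:
    """Return a description of each unmet minimum; an empty list means all are met."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU cores: {cpu_cores} < {MIN_CPU_CORES}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {ram_gb:.1f} GB < {MIN_RAM_GB} GB")
    if free_disk_gb < MIN_DISK_GB:
        problems.append(f"Free disk: {free_disk_gb:.1f} GB < {MIN_DISK_GB} GB")
    return problems


def probe_host(staging_path: str = "/") -> Tuple[int, float, float]:
    """Read core count, physical RAM, and free disk space on a Linux host."""
    cores = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_bytes = shutil.disk_usage(staging_path).free
    return cores, ram_bytes / 1e9, free_bytes / 1e9
```

On the host itself, `shortfalls(*probe_host("/path/to/staging"))` lists any gaps, where the staging path is whatever folder you plan to use for extracted data.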
What part of my on-premises system should I install the Slingshot on if I’m using IBM PureData System for Analytics sources?
When you are installing the Slingshot for PureData System for Analytics sources, install the Slingshot on your “injection” system (the system that you use to stage data to load into the PureData System for Analytics database). That system will have good connectivity to the PureData System for Analytics and will have lots of disk space for staging data. However, if your injection system is already fully loaded, install the Slingshot on a comparable system that has similar connectivity and significant staging disk space.
Where should I install the Slingshot for the best throughput if I’m using IBM PureData System for Analytics as my source?
We strongly recommend that you install the Slingshot on a Linux machine. When your Slingshot is installed on a Linux machine, data extraction from PureData System for Analytics sources uses high-speed unload facilities, which gives significantly better overall throughput.
What command line interface (CLI) options can I use for Slingshot?
| start | start the Slingshot |
| stop | stop the Slingshot |
| restart | stop and start the Slingshot |
| version | get version information |
| version verbose | get verbose version information |
| newtoken <value> | specify a new security token for this Slingshot |
| newdiskfolder <path> | specify a new folder that will contain your CSV files and extract directory |
| newdiskmax <value> | specify a new limit to how much disk space can be consumed by the Slingshot for data extract |
| newport <value> | specify a new port that the Slingshot will use to communicate with IBM Cloud Lift service |
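If you script these options, for example to switch tokens between connections, a thin wrapper can validate the option before anything is run. The Python sketch below is an illustration under assumptions: the option names come from the table above, but the `build_command` helper and the executable path you pass to it are hypothetical, because the name of the Slingshot launcher depends on your installation.

```python
from typing import List, Optional

# Options from the table above, grouped by whether they take a value.
NO_VALUE = {"start", "stop", "restart", "version"}
ONE_VALUE = {"newtoken", "newdiskfolder", "newdiskmax", "newport"}


def build_command(executable: str, option: str,
                  value: Optional[str] = None) -> List[str]:
    """Build an argument list for one Slingshot CLI option.

    `executable` is a placeholder for the launcher in your installation.
    """
    if option in ONE_VALUE:
        if value is None:
            raise ValueError(f"{option} requires a value")
        return [executable, option, value]
    if option in NO_VALUE:
        if value is None:
            return [executable, option]
        if option == "version" and value == "verbose":
            return [executable, "version", "verbose"]
        raise ValueError(f"{option} does not take a value")
    raise ValueError(f"unknown option: {option}")
```

The returned list can be passed to `subprocess.run` once you have substituted the real launcher path. Building the list separately keeps the validation testable without a Slingshot installation.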
During a Lift maintenance window, what happens to my Lift activity that is running at the time?
The updated version of the service automatically takes over the activity. No action is required of you. Events are generated in the job log that show the activity initially pausing because the old service is no longer communicating with the activity and then being automatically resumed by the new service.
This same behavior occurs if there is at any point an unexpected outage of the Lift service or a network outage that impacts connectivity between the service and the Slingshot.
Parsing your .csv source
If you experience problems, you can perform troubleshooting tasks to determine the corrective action to take, search documentation, or contact support by visiting the troubleshooting help section. You can also search posts from the dwAnswers or Stack Overflow communities.