Using Sqoop 1.4.6 With Hadoop 2.7.4
Published: 2019-06-28


This article covers the installation, configuration, and basic usage of Sqoop 1.4.6.

I. Installation and Configuration
1. Installing Sqoop

[hadoop@hdp01 ~]$ wget http://mirror.bit.edu.cn/apache/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
[hadoop@hdp01 ~]$ tar -xzf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
[hadoop@hdp01 ~]$ mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha /u01/sqoop

-- Edit the Sqoop environment variables
[hadoop@hdp01 ~]$ cd /u01/sqoop/conf
[hadoop@hdp01 conf]$ cp sqoop-env-template.sh sqoop-env.sh
[hadoop@hdp01 conf]$ vi sqoop-env.sh
export HADOOP_COMMON_HOME=/u01/hadoop
export HADOOP_MAPRED_HOME=/u01/hadoop
export HBASE_HOME=/u01/hbase
export HIVE_HOME=/u01/hive
export ZOOCFGDIR=/u01/zookeeper/conf

-- Comment out the following lines in configure-sqoop
#if [ -z "${HCAT_HOME}" ]; then
#  if [ -d "/usr/lib/hive-hcatalog" ]; then
#    HCAT_HOME=/usr/lib/hive-hcatalog
#  elif [ -d "/usr/lib/hcatalog" ]; then
#    HCAT_HOME=/usr/lib/hcatalog
#  else
#    HCAT_HOME=${SQOOP_HOME}/../hive-hcatalog
#    if [ ! -d ${HCAT_HOME} ]; then
#       HCAT_HOME=${SQOOP_HOME}/../hcatalog
#    fi
#  fi
#fi
#if [ -z "${ACCUMULO_HOME}" ]; then
#  if [ -d "/usr/lib/accumulo" ]; then
#    ACCUMULO_HOME=/usr/lib/accumulo
#  else
#    ACCUMULO_HOME=${SQOOP_HOME}/../accumulo
#  fi
#fi
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
#  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
#  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#
#if [ ! -d "${ACCUMULO_HOME}" ]; then
#  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
#  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi

-- Edit the user's environment variables
[hadoop@hdp01 ~]$ vi .bash_profile
export SQOOP_HOME=/u01/sqoop
export SQOOP_CONF_DIR=$SQOOP_HOME/conf
export SQOOP_CLASSPATH=$SQOOP_CONF_DIR
export PATH=$PATH:$SQOOP_HOME/bin
[hadoop@hdp01 ~]$ source .bash_profile

-- Verify the Sqoop installation
[hadoop@hdp01 ~]$ sqoop version
2017-12-28 09:30:01,801 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Sqoop 1.4.6
git commit id c0c5a81723759fa575844a0a1eae8f510fa32c25
Compiled by root on Mon Apr 27 14:38:36 CST 2015

Alternatively, run sqoop-version.

-- Copy the JDBC drivers
Copy the MySQL, PostgreSQL, and Oracle JDBC drivers into $SQOOP_HOME/lib.
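For example, assuming the three driver jars have already been downloaded into the hadoop user's home directory (the file names below are placeholders; use whichever versions you actually downloaded):

[hadoop@hdp01 ~]$ cp mysql-connector-java-5.1.44-bin.jar /u01/sqoop/lib/
[hadoop@hdp01 ~]$ cp postgresql-42.1.4.jar /u01/sqoop/lib/
[hadoop@hdp01 ~]$ cp ojdbc6.jar /u01/sqoop/lib/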

II. Using Sqoop

1. Testing each JDBC driver connection with Sqoop
1.1 Connecting Sqoop to MySQL

[hadoop@hdp01 bin]$ sqoop list-tables --username root -P --connect jdbc:mysql://192.168.120.92:3306/smsqw?useSSL=false
2017-12-28 09:38:19,587 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 09:38:23,067 [myid:] - INFO  [main:MySQLManager@69] - Preparing to use a MySQL streaming resultset.
PhoneTest
Phone
history_store
tbAreaprefix
tbAreaprefix_bak
tbBill
tbBilltmp
tbCat
tbContact
tbDataPath
tbDeliverMsg
tbDeliverMsg2
tbDest
tbLocPrefix
tbMessage
tbPrice
tbReceiver
tbSSLog
tbSendState
tbSendState2
tbSmsSendState
tbTest
tbUser
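As an additional sanity check on the same connection, Sqoop can also list the databases on the MySQL server (a minimal sketch, not part of the original session):

[hadoop@hdp01 ~]$ sqoop list-databases --connect jdbc:mysql://192.168.120.92:3306/ --username root -P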

1.2 Connecting Sqoop to PostgreSQL

[hadoop@hdp01 ~]$ sqoop list-tables --username rhnuser -P --connect jdbc:postgresql://192.168.120.93:5432/rhndb
2017-12-28 09:40:24,842 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 09:40:29,775 [myid:] - INFO  [main:SqlManager@98] - Using default fetchSize of 1000
rhnservergroupmembers
rhntemplatestring
rhnservergrouptypefeature
rhnserverhistory
qrtz_fired_triggers

1.3 Connecting Sqoop to Oracle

[hadoop@hdp01 ~]$ sqoop list-tables --username spwuser -P --connect jdbc:oracle:thin:@192.168.120.121:1521/rhndb --driver oracle.jdbc.driver.OracleDriver
2017-12-28 10:01:43,337 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 10:01:43,425 [myid:] - INFO  [main:SqlManager@98] - Using default fetchSize of 1000
rhnservergroupmembers
rhntemplatestring
rhnservergrouptypefeature
rhnserverhistory
qrtz_fired_triggers
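The connect string above uses the service-name form of the Oracle thin URL (host:port/service). If the instance were addressed by SID instead, the separator would be a colon rather than a slash, for example (ORCL here is a placeholder SID):

jdbc:oracle:thin:@192.168.120.121:1521:ORCL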

1.4 Connecting Sqoop to Hive

Based on the PostgreSQL table, create a table named rhnpackagefile in Hive without loading any data; the data import itself is covered later.

[hadoop@hdp01 ~]$ sqoop create-hive-table --connect jdbc:postgresql://192.168.120.93:5432/rhndb --table rhnpackagefile --username rhnuser -P --hive-database hivedb
2017-12-28 10:32:01,376 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 10:32:04,699 [myid:] - INFO  [main:BaseSqoopTool@1353] - Using Hive-specific delimiters for output. You can override
2017-12-28 10:32:04,699 [myid:] - INFO  [main:BaseSqoopTool@1354] - delimiters with --fields-terminated-by, etc.
2017-12-28 10:32:04,819 [myid:] - INFO  [main:SqlManager@98] - Using default fetchSize of 1000
2017-12-28 10:32:05,015 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM "rhnpackagefile" AS t LIMIT 1
2017-12-28 10:32:05,674 [myid:] - INFO  [main:HiveImport@194] - Loading uploaded data into Hive
2017-12-28 10:32:09,089 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Class path contains multiple SLF4J bindings.
2017-12-28 10:32:09,090 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,090 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,090 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,091 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/tez/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,091 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Found binding in [jar:file:/u01/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2017-12-28 10:32:09,091 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2017-12-28 10:32:09,095 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2017-12-28 10:32:11,996 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - 
2017-12-28 10:32:11,996 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - Logging initialized using configuration in jar:file:/u01/hive/lib/hive-common-2.3.2.jar!/hive-log4j2.properties Async: true
2017-12-28 10:32:16,650 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - OK
2017-12-28 10:32:16,783 [myid:] - INFO  [Thread-6:LoggingAsyncSink$LoggingThread@85] - Time taken: 3.433 seconds
2017-12-28 10:32:17,248 [myid:] - INFO  [main:HiveImport@242] - Hive import complete.
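One way to confirm that the table was created with the expected schema (not shown in the original session) is from the Hive CLI:

hive> use hivedb;
hive> describe rhnpackagefile;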

2. Data Migration

2.1 PostgreSQL☞Hive

[hadoop@hdp01 ~]$ sqoop import --connect jdbc:postgresql://192.168.120.93:5432/rhndb --table rhnpackagefile --username rhnuser -P --fields-terminated-by ',' --hive-import --hive-database hivedb --columns package_id,capability_id,device,inode,file_mode,username,groupname,rdev,file_size,mtime,checksum_id,linkto,flags,verifyflags,lang,created,modified --split-by modified -m 4
2017-12-28 11:24:46,666 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 11:24:48,891 [myid:] - INFO  [main:SqlManager@98] - Using default fetchSize of 1000
2017-12-28 11:24:48,894 [myid:] - INFO  [main:CodeGenTool@92] - Beginning code generation
2017-12-28 11:24:49,091 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM "rhnpackagefile" AS t LIMIT 1
2017-12-28 11:24:49,127 [myid:] - INFO  [main:CompilationManager@94] - HADOOP_MAPRED_HOME is /u01/hadoop
Note: /tmp/sqoop-hadoop/compile/ca09f6bb133fa32808220902aedc0437/rhnpackagefile.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-12-28 11:24:50,481 [myid:] - INFO  [main:CompilationManager@330] - Writing jar file: /tmp/sqoop-hadoop/compile/ca09f6bb133fa32808220902aedc0437/rhnpackagefile.jar
2017-12-28 11:24:50,493 [myid:] - WARN  [main:PostgresqlManager@119] - It looks like you are importing from postgresql.
2017-12-28 11:24:50,493 [myid:] - WARN  [main:PostgresqlManager@120] - This transfer can be faster! Use the --direct
2017-12-28 11:24:50,494 [myid:] - WARN  [main:PostgresqlManager@121] - option to exercise a postgresql-specific fast path.
2017-12-28 11:24:50,495 [myid:] - INFO  [main:ImportJobBase@235] - Beginning import of rhnpackagefile
2017-12-28 11:24:50,496 [myid:] - INFO  [main:Configuration@1019] - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-12-28 11:24:50,634 [myid:] - INFO  [main:Configuration@1019] - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-12-28 11:24:51,160 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 11:24:51,506 [myid:] - INFO  [main:TimelineClientImpl@123] - Timeline service address: http://hdp01:8188/ws/v1/timeline/
2017-12-28 11:24:51,696 [myid:] - INFO  [main:AHSProxy@42] - Connecting to Application History server at hdp01.thinkjoy.tt/192.168.120.96:10201
2017-12-28 11:24:53,801 [myid:] - INFO  [main:DBInputFormat@192] - Using read commited transaction isolation
2017-12-28 11:24:53,805 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 11:24:53,805 [myid:] - INFO  [main:DataDrivenDBInputFormat@147] - BoundingValsQuery: SELECT MIN("modified"), MAX("modified") FROM "rhnpackagefile"
2017-12-28 11:25:14,854 [myid:] - WARN  [main:TextSplitter@64] - Generating splits for a textual index column.
2017-12-28 11:25:14,854 [myid:] - WARN  [main:TextSplitter@65] - If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
2017-12-28 11:25:14,854 [myid:] - WARN  [main:TextSplitter@67] - You are strongly encouraged to choose an integral split column.
2017-12-28 11:25:14,903 [myid:] - INFO  [main:JobSubmitter@396] - number of splits:6
2017-12-28 11:25:14,997 [myid:] - INFO  [main:JobSubmitter@479] - Submitting tokens for job: job_1514358672274_0009
2017-12-28 11:25:15,453 [myid:] - INFO  [main:YarnClientImpl@236] - Submitted application application_1514358672274_0009
2017-12-28 11:25:15,485 [myid:] - INFO  [main:Job@1289] - The url to track the job: http://hdp01:8088/proxy/application_1514358672274_0009/
2017-12-28 11:25:15,486 [myid:] - INFO  [main:Job@1334] - Running job: job_1514358672274_0009
2017-12-28 11:25:24,763 [myid:] - INFO  [main:Job@1355] - Job job_1514358672274_0009 running in uber mode : false
2017-12-28 11:25:24,764 [myid:] - INFO  [main:Job@1362] -  map 0% reduce 0%
2017-12-28 11:26:00,465 [myid:] - INFO  [main:Job@1362] -  map 17% reduce 0%
2017-12-28 11:26:01,625 [myid:] - INFO  [main:Job@1362] -  map 50% reduce 0%
2017-12-28 11:26:03,643 [myid:] - INFO  [main:Job@1362] -  map 83% reduce 0%
2017-12-28 11:34:22,028 [myid:] - INFO  [main:Job@1362] -  map 100% reduce 0%
2017-12-28 11:34:22,035 [myid:] - INFO  [main:Job@1373] - Job job_1514358672274_0009 completed successfully
2017-12-28 11:34:22,162 [myid:] - INFO  [main:Job@1380] - Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=860052
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=913
                HDFS: Number of bytes written=3985558014
                HDFS: Number of read operations=24
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=12
        Job Counters 
                Killed map tasks=1
                Launched map tasks=7
                Other local map tasks=7
                Total time spent by all maps in occupied slots (ms)=1208611
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=1208611
                Total vcore-seconds taken by all map tasks=1208611
                Total megabyte-seconds taken by all map tasks=4331661824
        Map-Reduce Framework
                Map input records=18680041
                Map output records=18680041
                Input split bytes=913
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=4453
                CPU time spent (ms)=180780
                Physical memory (bytes) snapshot=1957969920
                Virtual memory (bytes) snapshot=30116270080
                Total committed heap usage (bytes)=1611661312
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=3985558014
2017-12-28 11:34:22,170 [myid:] - INFO  [main:ImportJobBase@184] - Transferred 3.7118 GB in 571.0001 seconds (6.6566 MB/sec)
2017-12-28 11:34:22,174 [myid:] - INFO  [main:ImportJobBase@186] - Retrieved 18680041 records.
2017-12-28 11:34:22,215 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM "rhnpackagefile" AS t LIMIT 1
2017-12-28 11:34:22,245 [myid:] - INFO  [main:HiveImport@194] - Loading uploaded data into Hive
2017-12-28 11:34:28,609 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - 
2017-12-28 11:34:28,609 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - Logging initialized using configuration in jar:file:/u01/hive/lib/hive-common-2.3.2.jar!/hive-log4j2.properties Async: true
2017-12-28 11:34:31,619 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - OK
2017-12-28 11:34:31,622 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - Time taken: 1.666 seconds
2017-12-28 11:34:32,026 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - Loading data to table hivedb.rhnpackagefile
2017-12-28 11:36:14,783 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - OK
2017-12-28 11:36:14,908 [myid:] - INFO  [Thread-98:LoggingAsyncSink$LoggingThread@85] - Time taken: 103.285 seconds
2017-12-28 11:36:15,363 [myid:] - INFO  [main:HiveImport@242] - Hive import complete.
2017-12-28 11:36:15,372 [myid:] - INFO  [main:HiveImport@278] - Export directory is contains the _SUCCESS file only, removing the directory.
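The log above reports 18,680,041 records retrieved; a quick cross-check from the Hive side (not part of the original session) would be:

hive> select count(*) from hivedb.rhnpackagefile;
-- expected result: 18680041, matching "Retrieved 18680041 records" in the import log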

2.2 MySQL☞HDFS

[hadoop@hdp01 ~]$ sqoop import --connect jdbc:mysql://192.168.120.92:3306/smsqw --username smsqw -P --table tbDest --columns iMsgID,cDest,tTime,cSMID,iReSend,tLastProcess,cEnCode,tCreateDT,iNum,iResult,iPriority,iPayment,cState,tGpTime --split-by tGpTime --target-dir /user/DataSource/MySQL/tbDest
2017-12-28 14:36:52,550 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 14:36:55,496 [myid:] - INFO  [main:MySQLManager@69] - Preparing to use a MySQL streaming resultset.
2017-12-28 14:36:55,497 [myid:] - INFO  [main:CodeGenTool@92] - Beginning code generation
Thu Dec 28 14:36:55 CST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
2017-12-28 14:36:56,233 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM `tbDest` AS t LIMIT 1
2017-12-28 14:36:56,253 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM `tbDest` AS t LIMIT 1
2017-12-28 14:36:56,260 [myid:] - INFO  [main:CompilationManager@94] - HADOOP_MAPRED_HOME is /u01/hadoop
Note: /tmp/sqoop-hadoop/compile/4a4024e6b2baa336939a9310f627636a/tbDest.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-12-28 14:36:57,637 [myid:] - INFO  [main:CompilationManager@330] - Writing jar file: /tmp/sqoop-hadoop/compile/4a4024e6b2baa336939a9310f627636a/tbDest.jar
2017-12-28 14:36:57,650 [myid:] - WARN  [main:MySQLManager@107] - It looks like you are importing from mysql.
2017-12-28 14:36:57,650 [myid:] - WARN  [main:MySQLManager@108] - This transfer can be faster! Use the --direct
2017-12-28 14:36:57,650 [myid:] - WARN  [main:MySQLManager@109] - option to exercise a MySQL-specific fast path.
2017-12-28 14:36:57,650 [myid:] - INFO  [main:MySQLManager@189] - Setting zero DATETIME behavior to convertToNull (mysql)
2017-12-28 14:36:57,652 [myid:] - INFO  [main:ImportJobBase@235] - Beginning import of tbDest
2017-12-28 14:36:57,653 [myid:] - INFO  [main:Configuration@1019] - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-12-28 14:36:57,820 [myid:] - INFO  [main:Configuration@1019] - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-12-28 14:36:58,229 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 14:36:58,581 [myid:] - INFO  [main:TimelineClientImpl@123] - Timeline service address: http://hdp01:8188/ws/v1/timeline/
2017-12-28 14:36:58,770 [myid:] - INFO  [main:AHSProxy@42] - Connecting to Application History server at hdp01.thinkjoy.tt/192.168.120.96:10201
Thu Dec 28 14:37:01 CST 2017 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
2017-12-28 14:37:01,123 [myid:] - INFO  [main:DBInputFormat@192] - Using read commited transaction isolation
2017-12-28 14:37:01,124 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 14:37:01,124 [myid:] - INFO  [main:DataDrivenDBInputFormat@147] - BoundingValsQuery: SELECT MIN(`tGpTime`), MAX(`tGpTime`) FROM `tbDest`
2017-12-28 14:37:17,446 [myid:] - INFO  [main:JobSubmitter@396] - number of splits:4
2017-12-28 14:37:17,541 [myid:] - INFO  [main:JobSubmitter@479] - Submitting tokens for job: job_1514358672274_0012
2017-12-28 14:37:17,966 [myid:] - INFO  [main:YarnClientImpl@236] - Submitted application application_1514358672274_0012
2017-12-28 14:37:17,996 [myid:] - INFO  [main:Job@1289] - The url to track the job: http://hdp01:8088/proxy/application_1514358672274_0012/
2017-12-28 14:37:17,996 [myid:] - INFO  [main:Job@1334] - Running job: job_1514358672274_0012
2017-12-28 14:37:26,149 [myid:] - INFO  [main:Job@1355] - Job job_1514358672274_0012 running in uber mode : false
2017-12-28 14:37:26,150 [myid:] - INFO  [main:Job@1362] -  map 0% reduce 0%
2017-12-28 14:39:52,733 [myid:] - INFO  [main:Job@1362] -  map 25% reduce 0%
2017-12-28 14:40:14,978 [myid:] - INFO  [main:Job@1362] -  map 75% reduce 0%
2017-12-28 14:40:43,183 [myid:] - INFO  [main:Job@1362] -  map 100% reduce 0%
2017-12-28 14:40:43,191 [myid:] - INFO  [main:Job@1373] - Job job_1514358672274_0012 completed successfully
2017-12-28 14:40:43,321 [myid:] - INFO  [main:Job@1380] - Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=573248
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=609
                HDFS: Number of bytes written=5399155888
                HDFS: Number of read operations=16
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=8
        Job Counters 
                Killed map tasks=2
                Launched map tasks=6
                Other local map tasks=6
                Total time spent by all maps in occupied slots (ms)=724670
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=724670
                Total vcore-seconds taken by all map tasks=724670
                Total megabyte-seconds taken by all map tasks=2597217280
        Map-Reduce Framework
                Map input records=31037531
                Map output records=31037531
                Input split bytes=609
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=3675
                CPU time spent (ms)=588590
                Physical memory (bytes) snapshot=4045189120
                Virtual memory (bytes) snapshot=20141694976
                Total committed heap usage (bytes)=1943535616
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=5399155888
2017-12-28 14:40:43,329 [myid:] - INFO  [main:ImportJobBase@184] - Transferred 5.0284 GB in 225.0893 seconds (22.8755 MB/sec)
2017-12-28 14:40:43,335 [myid:] - INFO  [main:ImportJobBase@186] - Retrieved 31037531 records.
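The import wrote roughly 5 GB into the target directory as the part-m-* files produced by the four map tasks. They can be inspected with standard HDFS commands (a follow-up check, not part of the original log):

[hadoop@hdp01 ~]$ hdfs dfs -ls /user/DataSource/MySQL/tbDest
[hadoop@hdp01 ~]$ hdfs dfs -du -s -h /user/DataSource/MySQL/tbDest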

2.3 HDFS☞MySQL

[hadoop@hdp01 ~]$ sqoop export --connect jdbc:mysql://192.168.120.92:3306/smsqw?useSSL=false --username smsqw -P --table tbDest2 --export-dir /user/DataSource/MySQL/tbDest
2017-12-28 16:03:18,922 [myid:] - INFO  [main:Sqoop@92] - Running Sqoop version: 1.4.6
Enter password: 
2017-12-28 16:03:21,934 [myid:] - INFO  [main:MySQLManager@69] - Preparing to use a MySQL streaming resultset.
2017-12-28 16:03:21,934 [myid:] - INFO  [main:CodeGenTool@92] - Beginning code generation
2017-12-28 16:03:22,343 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM `tbDest2` AS t LIMIT 1
2017-12-28 16:03:22,365 [myid:] - INFO  [main:SqlManager@757] - Executing SQL statement: SELECT t.* FROM `tbDest2` AS t LIMIT 1
2017-12-28 16:03:22,373 [myid:] - INFO  [main:CompilationManager@94] - HADOOP_MAPRED_HOME is /u01/hadoop
Note: /tmp/sqoop-hadoop/compile/332a6c4b30e942c56cf7f507cdff5761/tbDest2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2017-12-28 16:03:23,752 [myid:] - INFO  [main:CompilationManager@330] - Writing jar file: /tmp/sqoop-hadoop/compile/332a6c4b30e942c56cf7f507cdff5761/tbDest2.jar
2017-12-28 16:03:23,762 [myid:] - INFO  [main:ExportJobBase@378] - Beginning export of tbDest2
2017-12-28 16:03:23,762 [myid:] - INFO  [main:Configuration@1019] - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2017-12-28 16:03:24,011 [myid:] - INFO  [main:Configuration@1019] - mapred.jar is deprecated. Instead, use mapreduce.job.jar
2017-12-28 16:03:24,738 [myid:] - INFO  [main:Configuration@1019] - mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
2017-12-28 16:03:24,742 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2017-12-28 16:03:24,743 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 16:03:25,087 [myid:] - INFO  [main:TimelineClientImpl@123] - Timeline service address: http://hdp01:8188/ws/v1/timeline/
2017-12-28 16:03:25,269 [myid:] - INFO  [main:AHSProxy@42] - Connecting to Application History server at hdp01.thinkjoy.tt/192.168.120.96:10201
2017-12-28 16:03:27,400 [myid:] - INFO  [main:FileInputFormat@281] - Total input paths to process : 4
2017-12-28 16:03:27,406 [myid:] - INFO  [main:FileInputFormat@281] - Total input paths to process : 4
2017-12-28 16:03:27,484 [myid:] - INFO  [main:JobSubmitter@396] - number of splits:4
2017-12-28 16:03:27,493 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
2017-12-28 16:03:27,493 [myid:] - INFO  [main:Configuration@1019] - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2017-12-28 16:03:27,577 [myid:] - INFO  [main:JobSubmitter@479] - Submitting tokens for job: job_1514358672274_0020
2017-12-28 16:03:28,062 [myid:] - INFO  [main:YarnClientImpl@236] - Submitted application application_1514358672274_0020
2017-12-28 16:03:28,091 [myid:] - INFO  [main:Job@1289] - The url to track the job: http://hdp01:8088/proxy/application_1514358672274_0020/
2017-12-28 16:03:28,092 [myid:] - INFO  [main:Job@1334] - Running job: job_1514358672274_0020
2017-12-28 16:17:18,663 [myid:] - INFO  [main:Job@1355] - Job job_1514358672274_0020 running in uber mode : false
2017-12-28 16:17:18,665 [myid:] - INFO  [main:Job@1362] -  map 0% reduce 0%
2017-12-28 16:17:34,148 [myid:] - INFO  [main:Job@1362] -  map 1% reduce 0%
2017-12-28 16:17:43,200 [myid:] - INFO  [main:Job@1362] -  map 2% reduce 0%
2017-12-28 16:17:55,269 [myid:] - INFO  [main:Job@1362] -  map 3% reduce 0%
......
2017-12-28 16:40:15,427 [myid:] - INFO  [main:Job@1362] -  map 100% reduce 0%
2017-12-28 16:40:32,491 [myid:] - INFO  [main:Job@1373] - Job job_1514358672274_0020 completed successfully
2017-12-28 16:40:32,659 [myid:] - INFO  [main:Job@1380] - Counters: 31
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=571960
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=5401517442
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=70
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters 
                Launched map tasks=4
                Other local map tasks=1
                Rack-local map tasks=3
                Total time spent by all maps in occupied slots (ms)=4931826
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=4931826
                Total vcore-seconds taken by all map tasks=4931826
                Total megabyte-seconds taken by all map tasks=17675664384
        Map-Reduce Framework
                Map input records=31037531
                Map output records=31037531
                Input split bytes=2192
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=21815
                CPU time spent (ms)=1522470
                Physical memory (bytes) snapshot=3453595648
                Virtual memory (bytes) snapshot=20112125952
                Total committed heap usage (bytes)=477102080
        File Input Format Counters 
                Bytes Read=0
        File Output Format Counters 
                Bytes Written=0
2017-12-28 16:40:32,667 [myid:] - INFO  [main:ExportJobBase@301] - Transferred 5.0306 GB in 2,227.9141 seconds (2.3122 MB/sec)
2017-12-28 16:40:32,671 [myid:] - INFO  [main:ExportJobBase@303] - Exported 31037531 records.
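Note that sqoop export requires the target table (tbDest2 here) to already exist in MySQL with a compatible column layout. The log reports 31,037,531 records exported; a simple cross-check on the MySQL side (assumed, not part of the original session):

mysql> select count(*) from tbDest2;
-- expected: 31037531, matching "Exported 31037531 records" above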

A reference table of commonly used import and export options is attached below:

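The original post provided this table as an image; as a rough substitute, the options actually used in the examples above are summarized here:

--connect <jdbc-uri>        JDBC connection string of the source/target database
--username <user> / -P      database user; -P prompts for the password interactively
--table <name>              table to import from (import) or insert into (export)
--columns <c1,c2,...>       subset of columns to transfer
--split-by <column>         column used to split the work among map tasks
-m <n>                      number of map tasks (degree of parallelism)
--target-dir <path>         HDFS directory that receives imported data
--export-dir <path>         HDFS directory read by sqoop export
--fields-terminated-by <c>  field delimiter of the generated files
--hive-import               load the imported data into Hive
--hive-database <db>        target Hive database
--direct                    use the database-specific fast path (MySQL/PostgreSQL)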

Source: http://xmual.baihongyu.com/
