public class TableTunnel extends Object
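Uploads and downloads are handled by two session classes, TableTunnel.UploadSession and TableTunnel.DownloadSession. Sample code (copying the data of one table into another table):

    import java.io.IOException;

    import com.aliyun.odps.Odps;
    import com.aliyun.odps.account.Account;
    import com.aliyun.odps.account.AliyunAccount;
    import com.aliyun.odps.data.Record;
    import com.aliyun.odps.data.RecordReader;
    import com.aliyun.odps.data.RecordWriter;
    import com.aliyun.odps.tunnel.TableTunnel;
    import com.aliyun.odps.tunnel.TableTunnel.DownloadSession;
    import com.aliyun.odps.tunnel.TableTunnel.UploadSession;
    import com.aliyun.odps.tunnel.TunnelException;

    public class Sample {
        // Fill in your own credentials, endpoints, project and table names.
        private static String accessID = "";
        private static String accessKey = "";
        private static String odpsURL = "";
        private static String tunnelURL = "";
        private static String project = "";
        private static String table1 = "";
        private static String table2 = "";

        public static void main(String[] args) {
            Account account = new AliyunAccount(accessID, accessKey);
            Odps odps = new Odps(account);
            odps.setEndpoint(odpsURL);
            odps.setDefaultProject(project);

            TableTunnel tunnel = new TableTunnel(odps);
            tunnel.setEndpoint(tunnelURL);

            try {
                // Open a reader over every record of the source table.
                DownloadSession downloadSession = tunnel.createDownloadSession(project, table1);
                long count = downloadSession.getRecordCount();
                RecordReader recordReader = downloadSession.openRecordReader(0, count);
                Record record;

                // Open a writer for block 0 of the destination table.
                UploadSession uploadSession = tunnel.createUploadSession(project, table2);
                RecordWriter recordWriter = uploadSession.openRecordWriter(0);

                while ((record = recordReader.read()) != null) {
                    recordWriter.write(record);
                }

                recordReader.close();
                recordWriter.close();
                // Commit the single block that was written.
                uploadSession.commit(new Long[]{0L});
            } catch (TunnelException e) {
                e.printStackTrace();
            } catch (IOException e1) {
                e1.printStackTrace();
            }
        }
    }
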
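
The sample commits a single block with `new Long[]{0L}`. When several RecordWriters each write their own block, the array passed to commit would presumably list every block ID that was written. The following is a minimal sketch of building such an array, assuming blocks are numbered 0 to n-1; `BlockIds` and `sequential` are hypothetical names and not part of the SDK, and the upper bound comes from the [0, 20000) range given in the UploadSession description on this page.

```java
// Hypothetical helper for illustration; not part of the ODPS SDK.
class BlockIds {
    // Block IDs are documented to lie in [0, 20000).
    static final long MAX_BLOCK_ID_EXCLUSIVE = 20000L;

    /** Returns the IDs 0..blockCount-1 in the Long[] shape that the sample passes to commit. */
    static Long[] sequential(int blockCount) {
        if (blockCount < 1 || blockCount > MAX_BLOCK_ID_EXCLUSIVE) {
            throw new IllegalArgumentException(
                    "blockCount must be in [1, 20000], got " + blockCount);
        }
        Long[] ids = new Long[blockCount];
        for (int i = 0; i < blockCount; i++) {
            ids[i] = (long) i; // widened to long, then boxed to Long
        }
        return ids;
    }
}
```

With three writers opened on blocks 0, 1 and 2, the result of `BlockIds.sequential(3)` could be handed to `uploadSession.commit(...)` in place of the literal array used in the sample.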
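| Modifier and Type | Class and Description |
|---|---|
| class | TableTunnel.DownloadSession: A DownloadSession represents a session for downloading data from an ODPS table and is normally created through TableTunnel. The session ID uniquely identifies the session and is available from TableTunnel.DownloadSession.getId(). The total number of records in the table is available from TableTunnel.DownloadSession.getRecordCount(), and callers can use that total to start concurrent downloads. A DownloadSession reads data through a RecordReader, which is created with the starting record position and the number of records to read. The HTTP request behind a RecordReader times out after 300 s, after which the service closes the connection. |
| static class | TableTunnel.DownloadStatus: Status of a download session. UNKNOWN: unknown; NORMAL: normal; CLOSED: closed; EXPIRED: expired. |
| class | TableTunnel.UploadSession: An UploadSession represents a session for uploading data to an ODPS table and is normally created through TableTunnel. An upload session has INSERT INTO semantics: multiple or repeated upload sessions against the same table or partition do not affect one another. The session ID uniquely identifies the session and is available from TableTunnel.UploadSession.getId(). An UploadSession writes data through RecordWriter instances. Each RecordWriter corresponds to one HTTP request, and a single UploadSession can create multiple RecordWriters. A block ID must be specified when a RecordWriter is created; the block ID uniquely identifies the RecordWriter and must lie in the range [0, 20000). A single block can upload at most 100 GB of data. Within one UploadSession, opening a RecordWriter several times with the same block ID overwrites earlier data: only the data uploaded by the last RecordWriter to call close() is kept. close() must not be called more than once on the same RecordWriter instance. The HTTP request behind a RecordWriter times out after 120 s: if no data is transferred for 120 s, the service closes the connection. Note that the HTTP layer itself has an 8 KB buffer. |
| static class | TableTunnel.UploadStatus: UploadStatus represents the current status of an upload. UNKNOWN: unknown; NORMAL: normal; CLOSING: closing; CLOSED: closed; CANCELED: canceled; EXPIRED: expired; CRITICAL: critical error. |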
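
The DownloadSession description says the record count can drive concurrent downloads and that each RecordReader needs a start position and a record count. A minimal sketch of that arithmetic follows. `DownloadRanges`, `Range` and `split` are hypothetical names introduced here, not SDK classes, and the even split is only one possible policy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the ODPS SDK.
class DownloadRanges {
    /** One slice of the table: read `count` records beginning at record `start`. */
    static final class Range {
        final long start;
        final long count;

        Range(long start, long count) {
            this.start = start;
            this.count = count;
        }
    }

    /** Splits recordCount records into at most `workers` contiguous, non-overlapping ranges. */
    static List<Range> split(long recordCount, int workers) {
        if (recordCount < 0 || workers < 1) {
            throw new IllegalArgumentException("recordCount >= 0 and workers >= 1 required");
        }
        List<Range> ranges = new ArrayList<>();
        long base = recordCount / workers;      // records every worker gets
        long remainder = recordCount % workers; // first `remainder` workers get one extra
        long start = 0;
        for (int i = 0; i < workers; i++) {
            long count = base + (i < remainder ? 1 : 0);
            if (count == 0) {
                break; // fewer records than workers: stop creating empty ranges
            }
            ranges.add(new Range(start, count));
            start += count;
        }
        return ranges;
    }
}
```

Each worker thread could then pass its own `range.start` and `range.count` to `openRecordReader`, in the same way the sample passes `0` and `count` for the whole table.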
| Constructor and Description |
|---|
| TableTunnel(Odps odps): Constructs an instance of this class. |
| Modifier and Type | Method and Description |
|---|---|
| TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName): Creates a download session on a non-partitioned table. |
| TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, boolean async): Creates a download session on a non-partitioned table. |
| TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, long shardId): Creates a download session on a shard table. |
| TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec): Creates a download session on a partitioned table. |
| TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, boolean async): Creates a download session on a partitioned table. |
| TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, long shardId): Creates a download session on a shard table. |
| TableTunnel.UploadSession | createUploadSession(String projectName, String tableName): Creates an upload session on a non-partitioned table. |
| TableTunnel.UploadSession | createUploadSession(String projectName, String tableName, PartitionSpec partitionSpec): Creates an upload session on a partitioned table. |
| TableTunnel.DownloadSession | getDownloadSession(String projectName, String tableName, long shardId, String id): Gets a download session that was created on a non-partitioned table. |
| TableTunnel.DownloadSession | getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, long shardId, String id): Gets a download session that was created on a shard table. |
| TableTunnel.DownloadSession | getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id): Gets a download session that was created on a partitioned table. |
| TableTunnel.DownloadSession | getDownloadSession(String projectName, String tableName, String id): Gets a download session that was created on a non-partitioned table. |
| TableTunnel.UploadSession | getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id): Gets an upload session that was created on a partitioned table. |
| TableTunnel.UploadSession | getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id, long shares, long shareId): Gets an upload session on a partitioned table that will upload data with TunnelBufferedWriter. When several such session instances (in multiple processes or threads) share one session ID, each instance must declare both its own unique identifier (shareId) and the number of instances sharing the session (shares). |
| TableTunnel.UploadSession | getUploadSession(String projectName, String tableName, String id): Gets an upload session that was created on a non-partitioned table. |
| TableTunnel.UploadSession | getUploadSession(String projectName, String tableName, String id, long shares, long shareId): Gets an upload session on a non-partitioned table that will upload data with TunnelBufferedWriter. When several such session instances (in multiple processes or threads) share one session ID, each instance must declare both its own unique identifier (shareId) and the number of instances sharing the session (shares). |
| void | setEndpoint(String endpoint): Sets the TunnelServer address. |

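public TableTunnel.UploadSession createUploadSession(String projectName, String tableName) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
Returns:
TableTunnel.UploadSession
Throws:
TunnelException

public TableTunnel.UploadSession createUploadSession(String projectName, String tableName, PartitionSpec partitionSpec) throws TunnelException

Note: the partition must be a last-level (leaf) partition. For example, if the table has two partition levels, pt and ds, values must be given for both; specifying only one of them is not supported.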
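
As a rough illustration of the leaf-partition rule, the sketch below checks that a partition description names every partition column before a session is requested. It is a stand-alone, hypothetical pre-flight check: `PartitionCheck`, `isFullySpecified` and the plain `name=value,name=value` text form are assumptions made for this example and are not the SDK's PartitionSpec API.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pre-flight check for illustration; not part of the ODPS SDK.
class PartitionCheck {
    /**
     * Returns true when specText (for example "pt=1,ds=2") assigns a non-empty value
     * to every column in partitionColumns, with no duplicates and no unknown columns.
     */
    static boolean isFullySpecified(List<String> partitionColumns, String specText) {
        Set<String> seen = new HashSet<>();
        for (String pair : specText.split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length != 2 || kv[0].trim().isEmpty() || kv[1].trim().isEmpty()) {
                return false; // malformed entry or missing value
            }
            if (!seen.add(kv[0].trim())) {
                return false; // the same column was given twice
            }
        }
        return seen.equals(new HashSet<>(partitionColumns));
    }
}
```

For a table partitioned by pt and ds, `"pt=1,ds=2"` passes the check, while `"pt=1"` alone does not, which mirrors the restriction in the note.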
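Parameters:
projectName - project name
tableName - table name; must not be a view
partitionSpec - the partition to use, as a PartitionSpec
Returns:
TableTunnel.UploadSession
Throws:
TunnelException

public TableTunnel.UploadSession getUploadSession(String projectName, String tableName, String id, long shares, long shareId) throws TunnelException

Gets an upload session on a non-partitioned table that will upload data with TunnelBufferedWriter.
When several such session instances (in multiple processes or threads) share one session ID, each instance must declare both its own unique identifier (shareId) and the number of instances sharing the session (shares).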
final String sid = ""; Thread t1 = new Thread() {
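
The shares/shareId bookkeeping described for this method can be sketched independently of the SDK. The example below only computes the pair of values each worker would declare; `ShareAssignment`, `Share` and `forWorkers` are hypothetical names, and numbering the workers from 0 follows the recommendation in the shareId parameter description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the ODPS SDK.
class ShareAssignment {
    /** The two values one worker declares when it re-opens the shared session. */
    static final class Share {
        final long shares;  // total number of instances sharing the session ID
        final long shareId; // this instance's identifier, counted from 0

        Share(long shares, long shareId) {
            this.shares = shares;
            this.shareId = shareId;
        }
    }

    /** Produces one Share per worker: shareId runs from 0 to workerCount - 1. */
    static List<Share> forWorkers(int workerCount) {
        if (workerCount < 1) {
            throw new IllegalArgumentException("workerCount must be at least 1");
        }
        List<Share> result = new ArrayList<>();
        for (long id = 0; id < workerCount; id++) {
            result.add(new Share(workerCount, id));
        }
        return result;
    }
}
```

Each thread or process would then call `getUploadSession(projectName, tableName, id, share.shares, share.shareId)` with the same session ID and its own Share.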
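Parameters:
projectName - project name
tableName - table name; must not be a view
id - ID of the upload session, see TableTunnel.UploadSession.getId()
shares - number of UploadSession instances that share this session ID
shareId - unique identifier of this UploadSession instance; non-negative integers starting from 0 are recommended
Returns:
TableTunnel.UploadSession
Throws:
TunnelException

public TableTunnel.UploadSession getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id, long shares, long shareId) throws TunnelException

Gets an upload session on a partitioned table that will upload data with TunnelBufferedWriter.
When several such session instances (in multiple processes or threads) share one session ID, each instance must declare both its own unique identifier (shareId) and the number of instances sharing the session (shares).

Parameters:
projectName - project name
tableName - table name; must not be a view
partitionSpec - the partition to use, as a PartitionSpec
id - ID of the upload session, see TableTunnel.UploadSession.getId()
shares - number of UploadSession instances that share this session ID
shareId - unique identifier of this UploadSession instance; non-negative integers starting from 0 are recommended
Returns:
TableTunnel.UploadSession
Throws:
TunnelException

public TableTunnel.UploadSession getUploadSession(String projectName, String tableName, String id) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
id - ID of the upload session, see TableTunnel.UploadSession.getId()
Returns:
TableTunnel.UploadSession
Throws:
TunnelException

public TableTunnel.UploadSession getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
partitionSpec - PartitionSpec describing the partition of the table being uploaded to
id - ID of the upload session, see TableTunnel.UploadSession.getId()
Returns:
TableTunnel.UploadSession
Throws:
TunnelException

public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName) throws TunnelException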
Parameters:
projectName - project name
tableName - table name; must not be a view
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, boolean async) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
async - create the session asynchronously, which can avoid connection timeouts when the table consists of many small files
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
partitionSpec - the partition to use, as a PartitionSpec
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, boolean async) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
partitionSpec - the partition to use, as a PartitionSpec
async - create the session asynchronously, which can avoid connection timeouts when the table consists of many small files
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, long shardId) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
shardId - the shard ID to use
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, long shardId) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
partitionSpec - the partition to use, as a PartitionSpec
shardId - the shard ID to use
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession getDownloadSession(String projectName, String tableName, String id) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
id - ID of the download session, see TableTunnel.DownloadSession.getId()
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession getDownloadSession(String projectName, String tableName, long shardId, String id) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
id - ID of the download session, see TableTunnel.DownloadSession.getId()
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
partitionSpec - the partition to use, as a PartitionSpec
id - ID of the download session, see TableTunnel.DownloadSession.getId()
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public TableTunnel.DownloadSession getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, long shardId, String id) throws TunnelException

Parameters:
projectName - project name
tableName - table name; must not be a view
partitionSpec - the partition to use, as a PartitionSpec
shardId - the shard ID to use
id - ID of the download session, see TableTunnel.DownloadSession.getId()
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException

public void setEndpoint(String endpoint)
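Sets the TunnelServer address. If no TunnelServer address is set, one is selected automatically.

Parameters:
endpoint - TunnelServer address

Copyright © 2019 Alibaba Cloud Computing. All rights reserved.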