Background: We needed to query data under certain filter conditions, but the query kept timing out. Investigation showed that the business table is split into 5 shards holding a little over 40 million rows in total (just under 50 million). When the query conditions do not include the sharding key, every shard gets scanned, which takes far too long. The ShardingSphere documentation describes a forced-routing (hint) mechanism, so I wanted to try it: route the query directly to one specific shard to cut the query time.
This problem was also a good excuse to dig into ShardingSphere's forced routing. So, let's get into it.
The official documentation gives an overview of forced (hint) routing. My reading of it: with hint routing, a SQL statement can be routed directly to a specific shard even when it carries no sharding key, so no full scan across all shards is needed.
Based on the official example, implementing forced routing breaks down into the following parts:
(1) Create the database- and table-level HintAlgorithm classes, implementing the doSharding method of HintShardingAlgorithm.
(2) Register each HintAlgorithm in the corresponding sharding strategy configuration (e.g. TableShardingStrategy).
(3) In business code, invoke the forced-routing logic through HintManager, and close it promptly when done.
Note: it is worth emphasizing that, as the configuration below shows, forced routing is configured under a specific table (t_order), in its databaseStrategy and tableStrategy sections:
shardingRule:
  tables:
    t_order:
      actualDataNodes: demo_ds_${0..1}.t_order_${0..1}
      databaseStrategy:
        # database-level forced routing
        hint:
          algorithmClassName: io.shardingsphere.userAlgo.DataBaseHintAlgorithm
      tableStrategy:
        # table-level forced routing
        hint:
          algorithmClassName: io.shardingsphere.userAlgo.TableHintAlgorithm
  defaultTableStrategy:
    none:
  defaultKeyGeneratorClassName: io.shardingsphere.core.keygen.DefaultKeyGenerator

props:
  sql.show: true
Project stack: a Spring Boot project using MyBatis on top of MySQL.
Existing sharding setup: the order table t_order has 5 physical shards: t_order, t_order_1, t_order_2, t_order_3, and t_order_4.
The sharding key is oid. The existing routing logic implements PreciseShardingAlgorithm (the precise sharding algorithm, which handles = and IN conditions on a single sharding key and must be used together with StandardShardingStrategy).
So today, if a SQL statement carries an oid condition it is routed straight to the right shard; if it does not, every shard is scanned.
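The existing TableShardingAlgorithm is not shown in this post, but a minimal sketch of a PreciseShardingAlgorithm implementation looks roughly like this (the hash-based suffix rule below is an illustrative assumption, not the project's actual oid-to-shard mapping):

import java.util.Collection;

import org.apache.shardingsphere.api.sharding.standard.PreciseShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.PreciseShardingValue;
import org.springframework.stereotype.Component;

@Component("tableShardingAlgorithm")
public class TableShardingAlgorithm implements PreciseShardingAlgorithm<String> {

    @Override
    public String doSharding(Collection<String> availableTargetNames,
                             PreciseShardingValue<String> shardingValue) {
        // Hypothetical mapping: hash the oid onto one of the 5 shards;
        // suffix 0 maps to the unsuffixed base table t_order.
        int suffix = Math.abs(shardingValue.getValue().hashCode()) % 5;
        String target = suffix == 0
                ? shardingValue.getLogicTableName()
                : shardingValue.getLogicTableName() + "_" + suffix;
        // Fall back to the base table if the computed shard is not registered.
        return availableTargetNames.contains(target) ? target : shardingValue.getLogicTableName();
    }
}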
Table-level hint routing: TableHintAlgorithm
@Component("tableHintAlgorithm")
@Slf4j
public class TableHintAlgorithm implements HintShardingAlgorithm<Long> {
/**
*
* @param availableTargetNames 业务表分表集合
* @param shardingValue 当前指定的分表路由
* @return 具体路由表
*/
@Override
public Collection<String> doSharding(
Collection<String> availableTargetNames,
HintShardingValue<Long> shardingValue) {
log.info("TableHintAlgorithm-availableTargetNames:{},-shardingValue:{}" + JSON.toJSONString(availableTargetNames), shardingValue);
// 添加分表路由逻辑
Collection<String> result = new ArrayList<>();
for (String each : availableTargetNames) {
for (Long value : shardingValue.getValues()) {
if (each.contains(String.valueOf(value))) {
result.add(each);
}
}
}
System.out.println("result:" + JSON.toJSONString(result));
return result;
}
}
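One thing to watch in doSharding above: the contains() match is fragile once shard numbers reach double digits, because a hint value of 1 would also match a hypothetical t_order_10. A stricter check under the "<logicTable>_<n>" naming used here would be a suffix match (the unsuffixed base table t_order would need its own convention, e.g. a dedicated hint value); matches is a hypothetical helper, not part of the original code:

// Stricter than each.contains(...): only accept an exact "_<n>" suffix.
// matches("t_order_1", 1L)  -> true
// matches("t_order_10", 1L) -> false
private boolean matches(String actualTable, Long hintValue) {
    return actualTable.endsWith("_" + hintValue);
}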
Database-level hint routing: DataBaseHintAlgorithm
@Component("dataBaseHintAlgorithm")
@Slf4j
public class DateBaseHintAlgorithm implements HintShardingAlgorithm<Long> {
@Override
public Collection<String> doSharding(
Collection<String> availableTargetNames,
HintShardingValue<Long> shardingValue) {
// 添加分库路由逻辑,直接路由到主库
Collection<String> result = new ArrayList<>();
result.add("ds0");
return result;
}
}
(The code blocks only show the main-line code; incidental code is omitted.)
public class OrderCenterDataSourceConfig {
    /**
     * Initialize the sharding data source.
     *
     * @param dataSource            the underlying data source
     * @param tableSharding         the existing (precise) routing logic
     * @param tableHintAlgorithm    the table-level forced-routing logic
     * @param dataBaseHintAlgorithm the database-level forced-routing logic
     * @return the sharding-aware DataSource
     * @throws Exception if the sharding data source cannot be created
     */
    @Bean
    public DataSource shardingDataSource(@Qualifier("orderCenterDataSource") DataSource dataSource,
                                         @Qualifier("tableShardingAlgorithm") TableShardingAlgorithm tableSharding,
                                         @Qualifier("tableHintAlgorithm") TableHintAlgorithm tableHintAlgorithm,
                                         @Qualifier("dataBaseHintAlgorithm") DataBaseHintAlgorithm dataBaseHintAlgorithm) throws Exception {
        Map<String, DataSource> dataSourceMap = new HashMap<>(1, 1);
        dataSourceMap.put("ds0", dataSource);
        // 1. Database-level configuration
        ShardingRuleConfiguration shardingRuleConfig = new ShardingRuleConfiguration();
        shardingRuleConfig.setDefaultDataSourceName("ds0");
        // 2. Table-level configuration
        for (String tableName : StringUtils.split(SHARDING_TABLES, ",")) {
            // 3. The original (precise) routing configuration
            TableRuleConfiguration orderTableRuleConfig = new TableRuleConfiguration(tableName, "ds0." + tableName);
            orderTableRuleConfig.setTableShardingStrategyConfig(new StandardShardingStrategyConfiguration("oid", tableSharding));
            shardingRuleConfig.getTableRuleConfigs().add(orderTableRuleConfig);
            // 4. The forced-routing configuration
            TableRuleConfiguration hintTableRuleConfig = new TableRuleConfiguration(tableName, "ds0." + tableName);
            hintTableRuleConfig.setTableShardingStrategyConfig(new HintShardingStrategyConfiguration(tableHintAlgorithm));
            hintTableRuleConfig.setDatabaseShardingStrategyConfig(new HintShardingStrategyConfiguration(dataBaseHintAlgorithm));
            shardingRuleConfig.getTableRuleConfigs().add(hintTableRuleConfig);
        }
        // 5. Binding-table groups
        shardingRuleConfig.getBindingTableGroups().add(SHARDING_TABLES);
        // 6. Properties
        Properties properties = new Properties();
        // 7. Connection mode
        properties.put(ConfigurationPropertyKey.MAX_CONNECTIONS_SIZE_PER_QUERY.getKey(), "20");
        // 8. Print the routed SQL
        properties.put(ConfigurationPropertyKey.SQL_SHOW.getKey(), true);
        // 9. Create the sharding data source
        DataSource shardingDataSource = ShardingDataSourceFactory.createDataSource(dataSourceMap, shardingRuleConfig, properties);
        // 10. Dynamically load the shard tables
        actualTablesRefresh(shardingDataSource);
        return shardingDataSource;
    }

    /**
     * Dynamically load the shard tables.
     */
    private void actualTablesRefresh(DataSource shardingDataSource) throws Exception {
        ShardingDataSource dataSource = (ShardingDataSource) shardingDataSource;
        String[] tableNames = StringUtils.split(SHARDING_TABLES, ",");
        for (String tableName : tableNames) {
            TableRule tableRule = null;
            try {
                // 11. Fetch the current table rule
                tableRule = dataSource.getRuntimeContext().getRule().getTableRule(tableName);
            } catch (ShardingSphereConfigurationException e) {
                // skip
            }
            if (tableRule == null) {
                throw new RuntimeException("failed to dynamically load shard tables");
            }
            // 12. Rebuild actualDataNodes etc. according to the sharding rules
            // ......
            // 13. Write the refreshed mapping back into the rule via reflection
            Field datasourceToTablesMapField = TableRule.class.getDeclaredField("datasourceToTablesMap");
            datasourceToTablesMapField.setAccessible(true);
            datasourceToTablesMapField.set(tableRule, datasourceToTablesMap);
        }
    }
}
@Test
public void test8() {
    // Obtain a HintManager instance
    HintManager hintManager = HintManager.getInstance();
    // Set the table routing hint; this should force the query to the t_order_1 shard
    hintManager.addTableShardingValue("t_order", 1L);
    // Set the database routing hint
    hintManager.addDatabaseShardingValue("t_order", 0);
    // Run the SQL
    Order order = flightOrderMapper.findById(749557987907031040L);
    // Close the forced routing
    hintManager.close();
}
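Incidentally, HintManager implements AutoCloseable, so a try-with-resources variant of the same test (the method name here is just illustrative) guarantees the hint is cleared even if the query throws:

@Test
public void test8TryWithResources() {
    // Equivalent to the test above, but the hint is cleared automatically.
    try (HintManager hintManager = HintManager.getInstance()) {
        hintManager.addTableShardingValue("t_order", 1L);
        hintManager.addDatabaseShardingValue("t_order", 0);
        Order order = flightOrderMapper.findById(749557987907031040L);
    }
}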
Log output:
2022-08-12 15:27:13.903 [ main] DEBUG ( BaseJdbcLogger.java: 143) || : ==> Preparing: SELECT id,oid,create_time,update_time FROM t_order WHERE id = ?
2022-08-12 15:27:13.929 [ main] DEBUG ( BaseJdbcLogger.java: 143) || : ==> Parameters: 123(Long)
2022-08-12 15:27:14.579 [ main] INFO ( SQLLogger.java: 74) || : Logic SQL: SELECT id,oid,create_time,update_time FROM t_order WHERE id = ?
2022-08-12 15:27:14.579 [ main] INFO ( SQLLogger.java: 74) || : SQLStatement: SelectStatementContext(super=CommonSQLStatementContext(sqlStatement=org.apache.shardingsphere.sql.parser.sql.statement.dml.SelectStatement@7e48974f, tablesContext=org.apache.shardingsphere.sql.parser.binder.segment.table.TablesContext@5205f975), tablesContext=org.apache.shardingsphere.sql.parser.binder.segment.table.TablesContext@5205f975, projectionsContext=ProjectionsContext(startIndex=7, stopIndex=543, distinctRow=false, projections=[ColumnProjection(owner=null, name=id, alias=Optional.empty), ColumnProjection(owner=null, name=oid, alias=Optional.empty),ColumnProjection(owner=null, name=create_time, alias=Optional.empty), ColumnProjection(owner=null, name=update_time, alias=Optional.empty)]), groupByContext=org.apache.shardingsphere.sql.parser.binder.segment.select.groupby.GroupByContext@5d39ef56, orderByContext=org.apache.shardingsphere.sql.parser.binder.segment.select.orderby.OrderByContext@967b0a2, paginationContext=org.apache.shardingsphere.sql.parser.binder.segment.select.pagination.PaginationContext@332fac2e, containsSubquery=false)
2022-08-12 15:27:14.580 [ main] INFO ( SQLLogger.java: 74) || : Actual SQL: ds0 ::: SELECT id,oid,create_time,update_time FROM t_order_1 WHERE id = ? ::: [123]
2022-08-12 15:27:14.581 [ main] INFO ( SQLLogger.java: 74) || : Actual SQL: ds0 ::: SELECT id,oid,create_time,update_time FROM t_order_2 WHERE id = ? ::: [123]
2022-08-12 15:27:14.581 [ main] INFO ( SQLLogger.java: 74) || : Actual SQL: ds0 ::: SELECT id,oid,create_time,update_time FROM t_order_3 WHERE id = ? ::: [123]
2022-08-12 15:27:14.581 [ main] INFO ( SQLLogger.java: 74) || : Actual SQL: ds0 ::: SELECT id,oid,create_time,update_time FROM t_order_4 WHERE id = ? ::: [123]
2022-08-12 15:27:14.581 [ main] INFO ( SQLLogger.java: 74) || : Actual SQL: ds0 ::: SELECT id,oid,create_time,update_time FROM t_order WHERE id = ? ::: [123]
2022-08-12 15:27:14.686 [ main] DEBUG ( BaseJdbcLogger.java: 143) || : <== Total: 1
The log shows that the query was not forced to t_order_1; every shard was scanned, so the configured forced routing did not take effect.
First I verified that the forced-routing rules had actually been loaded. According to the startup logs they had, and the configuration format matched the official documentation.
Next I stepped through with the debugger, focusing on steps 3, 4, and 10 of OrderCenterDataSourceConfig, to check whether the forced routing was loaded and whether it took effect.
After steps 3 and 4, the t_order table carried two rule configurations: the TableShardingAlgorithm (precise) configuration and the forced-routing configuration.
Continuing into step 10, the question was whether the dynamic table loading picked up the forced-routing configuration. Debugging into step 11 exposed the problem: step 11 fetches only the first rule registered for t_order, not all of them.
private void actualTablesRefresh(DataSource shardingDataSource) throws Exception {
    ShardingDataSource dataSource = (ShardingDataSource) shardingDataSource;
    String[] tableNames = StringUtils.split(SHARDING_TABLES, ",");
    for (String tableName : tableNames) {
        TableRule tableRule = null;
        try {
            // 11. Fetch the current table rule
            tableRule = dataSource.getRuntimeContext().getRule().getTableRule(tableName);
        } catch (ShardingSphereConfigurationException e) {
            // skip
        }
        if (tableRule == null) {
            throw new RuntimeException("failed to dynamically load shard tables");
        }
        // 12. Rebuild actualDataNodes etc. according to the sharding rules
        // ......
        // 13. Write the refreshed mapping back into the rule via reflection
        Field datasourceToTablesMapField = TableRule.class.getDeclaredField("datasourceToTablesMap");
        datasourceToTablesMapField.setAccessible(true);
        datasourceToTablesMapField.set(tableRule, datasourceToTablesMap);
    }
}
Looking at the source behind step 11, the rule lookup by tableName returns the first matching table rule by default. Therefore only one routing configuration per logical table can ever take effect.
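A simplified sketch of that lookup behavior (paraphrased, not the verbatim ShardingSphere source): the rule list is searched front to back and the first match wins, so the second TableRuleConfiguration we registered for t_order is never returned:

// Paraphrased sketch of ShardingRule's lookup: first match wins.
private Optional<TableRule> findTableRule(final String logicTableName) {
    return tableRules.stream()
            .filter(each -> each.getLogicTable().equalsIgnoreCase(logicTableName))
            .findFirst();
}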
Based on that finding, I removed step 3 from OrderCenterDataSourceConfig and configured only the forced routing:
/**
 * Initialize the sharding data source.
 *
 * @param dataSource            the underlying data source
 * @param tableSharding         the existing (precise) routing logic
 * @param tableHintAlgorithm    the table-level forced-routing logic
 * @param dataBaseHintAlgorithm the database-level forced-routing logic
 * @return the sharding-aware DataSource
 * @throws Exception if the sharding data source cannot be created
 */
@Bean
public DataSource shardingDataSource(@Qualifier("orderCenterDataSource") DataSource dataSource,
                                     @Qualifier("tableShardingAlgorithm") TableShardingAlgorithm tableSharding,
                                     @Qualifier("tableHintAlgorithm") TableHintAlgorithm tableHintAlgorithm,
                                     @Qualifier("dataBaseHintAlgorithm") DataBaseHintAlgorithm dataBaseHintAlgorithm) throws Exception {
    Map<String, DataSource> dataSourceMap = new HashMap<>(1, 1);
    dataSourceMap.put("ds0", dataSource);
    // 1. Database-level configuration
    ShardingRuleConfiguration shardingRuleConfig = new ShardingRuleConfiguration();
    shardingRuleConfig.setDefaultDataSourceName("ds0");
    // 2. Table-level configuration
    for (String tableName : StringUtils.split(SHARDING_TABLES, ",")) {
        // 4. Only the forced-routing configuration (step 3 removed)
        TableRuleConfiguration hintTableRuleConfig = new TableRuleConfiguration(tableName, "ds0." + tableName);
        hintTableRuleConfig.setTableShardingStrategyConfig(new HintShardingStrategyConfiguration(tableHintAlgorithm));
        hintTableRuleConfig.setDatabaseShardingStrategyConfig(new HintShardingStrategyConfiguration(dataBaseHintAlgorithm));
        shardingRuleConfig.getTableRuleConfigs().add(hintTableRuleConfig);
    }
    // 5. Binding-table groups
    shardingRuleConfig.getBindingTableGroups().add(SHARDING_TABLES);
    // 6. Properties
    Properties properties = new Properties();
    // 7. Connection mode
    properties.put(ConfigurationPropertyKey.MAX_CONNECTIONS_SIZE_PER_QUERY.getKey(), "20");
    // 8. Print the routed SQL
    properties.put(ConfigurationPropertyKey.SQL_SHOW.getKey(), true);
    // 9. Create the sharding data source
    DataSource shardingDataSource = ShardingDataSourceFactory.createDataSource(dataSourceMap, shardingRuleConfig, properties);
    // 10. Dynamically load the shard tables
    actualTablesRefresh(shardingDataSource);
    return shardingDataSource;
}
Running the test again: this time execution entered TableHintAlgorithm, and the query was routed to the t_order_1 shard only.
2022-08-12 16:09:41.008 [ main] DEBUG ( BaseJdbcLogger.java: 143) || : ==> Preparing: SELECT id,oid,create_time,update_time FROM t_order WHERE id = ?
2022-08-12 16:09:41.029 [ main] DEBUG ( BaseJdbcLogger.java: 143) || : ==> Parameters: 123(Long)
2022-08-12 16:09:41.554 [ main] INFO ( TableHintAlgorithm.java: 35) || : TableHintAlgorithm-availableTargetNames:HintShardingValue(logicTableName=t_order, columnName=, values=[1]),-shardingValue:{}["t_order_1","t_order_2","t_order_3","t_order_4","t_order"]
result:["t_order_1"]
2022-08-12 16:09:41.595 [ main] INFO ( SQLLogger.java: 74) || : Logic SQL: SELECT id,oid,create_time,update_time FROM t_order WHERE id = ?
2022-08-12 16:09:41.595 [ main] INFO ( SQLLogger.java: 74) || : SQLStatement: SelectStatementContext(super=CommonSQLStatementContext(sqlStatement=org.apache.shardingsphere.sql.parser.sql.statement.dml.SelectStatement@55c38884, tablesContext=org.apache.shardingsphere.sql.parser.binder.segment.table.TablesContext@81acb6c), tablesContext=org.apache.shardingsphere.sql.parser.binder.segment.table.TablesContext@81acb6c, projectionsContext=ProjectionsContext(startIndex=7, stopIndex=543, distinctRow=false, projections=[ColumnProjection(owner=null, name=id, alias=Optional.empty), ColumnProjection(owner=null, name=oid, alias=Optional.empty), ColumnProjection(owner=null, name=create_time, alias=Optional.empty), ColumnProjection(owner=null, name=update_time, alias=Optional.empty)]), groupByContext=org.apache.shardingsphere.sql.parser.binder.segment.select.groupby.GroupByContext@799c87, orderByContext=org.apache.shardingsphere.sql.parser.binder.segment.select.orderby.OrderByContext@2b3f7a68, paginationContext=org.apache.shardingsphere.sql.parser.binder.segment.select.pagination.PaginationContext@62f15023, containsSubquery=false)
2022-08-12 16:09:41.597 [ main] INFO ( SQLLogger.java: 74) || : Actual SQL: ds0 ::: SELECT id,oid,create_time,update_time FROM t_order_1 WHERE id = ? ::: [123]
2022-08-12 16:09:41.693 [ main] DEBUG ( BaseJdbcLogger.java: 143) || : <== Total: 1
Test 1: the investigation in (4) showed that only one rule configuration per table can take effect. So with forced routing configured, does routing by the sharding key still work? Let's test:
@Test
public void test8() {
    // oid is the table's sharding key
    Order o1 = flightOrderMapper.findByOrderId("D123456788");
}
The result: routing by the sharding key no longer reached a single shard; the query scanned every shard. You cannot have it both ways. The forced-routing configuration would have to go, and the query time would have to be reduced on the business side instead.
Test 2: what happens if multiple SQL statements run before the HintManager is closed? Do they all go through forced routing?
@Test
public void test8() {
    // Obtain a HintManager instance
    HintManager hintManager = HintManager.getInstance();
    // Set the table routing hint
    hintManager.addTableShardingValue("t_order", 1L);
    // Set the database routing hint
    hintManager.addDatabaseShardingValue("t_order", 0);
    // First SQL
    Order order = flightOrderMapper.findById(123);
    // Second SQL
    Order o1 = flightOrderMapper.findByOrderId("D221820xxxxxx");
    // Close the forced routing
    hintManager.close();
}
Test result: every SQL statement executed between creating the HintManager instance and closing it goes through the forced-routing logic. (HintManager holds the hint in a ThreadLocal, so it applies to everything the current thread runs until close() is called.)
Since HintManager conflicts with our existing sharding logic, I abandoned it as the fix for the slow queries.
A colleague suggested an alternative: our sharding key oid is derived from the creation time, so from the creation-time range in the query conditions we can compute the minimum and maximum possible oid values and query by that range; the query could then be routed by the sharding key. I tested this approach.
The SQL:
select * from t_order where oid >= 'D123456' and oid < 'D234567'
I changed the SQL and ran it, and got an error saying the current strategy does not support this kind of query. Our configured sharding rule implements PreciseShardingAlgorithm, which only handles = and IN on a single sharding key (used with StandardShardingStrategy); it does not support range lookups on the sharding key. Range lookups require implementing a second class, RangeShardingAlgorithm, which handles BETWEEN AND scenarios on a single sharding key and is likewise used with StandardShardingStrategy. Changing that strategy would mean reworking the main business line, so on balance we did not take this approach either.
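For reference, a minimal sketch of what range support could look like (illustrative only; we did not adopt this, and the class and bean names are hypothetical). In ShardingSphere 4.x the range algorithm receives the queried range as a Guava Range:

import java.util.ArrayList;
import java.util.Collection;

import org.apache.shardingsphere.api.sharding.standard.RangeShardingAlgorithm;
import org.apache.shardingsphere.api.sharding.standard.RangeShardingValue;
import org.springframework.stereotype.Component;

import com.google.common.collect.Range;

@Component("tableRangeShardingAlgorithm")
public class TableRangeShardingAlgorithm implements RangeShardingAlgorithm<String> {

    @Override
    public Collection<String> doSharding(Collection<String> availableTargetNames,
                                         RangeShardingValue<String> shardingValue) {
        // The oid range from the WHERE clause, e.g. ['D123456', 'D234567').
        Range<String> range = shardingValue.getValueRange();
        // Mapping an oid range onto shard suffixes is project-specific; without
        // that mapping, the conservative fallback is to return every shard.
        return new ArrayList<>(availableTargetNames);
    }
}

It would be wired in alongside the precise algorithm via the two-algorithm constructor, e.g. new StandardShardingStrategyConfiguration("oid", tableSharding, tableRangeShardingAlgorithm).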
With both options ruled out, the final solution was to constrain the query time window on the business side, reducing the amount of data scanned and therefore the query time.