-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-30161 Add paginated, single-RPC RegionLocator.getRegionLocations(startKey, limit) API for bulk meta-cache warmup #8236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: branch-2
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,6 +21,7 @@ | |
| import java.io.IOException; | ||
| import java.util.List; | ||
| import java.util.stream.Collectors; | ||
| import org.apache.hadoop.hbase.HConstants; | ||
| import org.apache.hadoop.hbase.HRegionLocation; | ||
| import org.apache.hadoop.hbase.TableName; | ||
| import org.apache.hadoop.hbase.util.Pair; | ||
|
|
@@ -130,6 +131,44 @@ default List<HRegionLocation> getRegionLocations(byte[] row) throws IOException | |
| */ | ||
| List<HRegionLocation> getAllRegionLocations() throws IOException; | ||
|
|
||
| /** | ||
| * Bulk lookup of region locations from {@code hbase:meta} in a single RPC, starting at | ||
| * {@code startKey} (region start-key boundary, inclusive) and returning at most {@code limit} | ||
| * regions in start-key order. | ||
| * <p/> | ||
| * The returned list includes all replicas of each region (matching | ||
| * {@link #getAllRegionLocations()}), and the result is also written to the connection's region | ||
| * location cache. | ||
| * <p/> | ||
| * Ordering: regions are returned in ascending region start-key order (the natural order of | ||
| * {@code hbase:meta} rows for a single table). Within each region, replicas are returned in | ||
| * ascending replica-id order (replica 0, then 1, then 2, ...). Split parents and offline regions | ||
| * are filtered out, which may cause a page to contain fewer than {@code limit} regions but never | ||
| * disturbs ordering of the survivors. | ||
| * <p/> | ||
| * To page through all regions of a table, call repeatedly passing | ||
| * {@code last.getRegion().getEndKey()} as the next {@code startKey}, where {@code last} is the | ||
| * final element of the previous response. All replicas of a region share the same | ||
| * {@link RegionInfo}, so the last entry's end key is the correct cursor regardless of which | ||
| * replica it is. Pass {@code null} for the first call. Stop paging when the returned list is | ||
| * empty or when the last region's end key is {@link HConstants#EMPTY_END_ROW} (zero-length) - | ||
| * that signals the end of the table; passing it back in would re-scan from the beginning since by | ||
| * convention an empty start key means "from the first region". | ||
| * <p/> | ||
| * Unlike {@link #getAllRegionLocations()}, this method performs at most one RPC against | ||
| * {@code hbase:meta} per invocation, so its latency is bounded by {@code limit} rather than table | ||
| * size. Suitable for callers that wrap meta lookups in a lock with a fixed timeout, e.g. for bulk | ||
| * region-cache warmup. | ||
| * @param startKey region start-key to begin scanning from (inclusive); {@code null} or empty | ||
| * starts from the first region | ||
| * @param limit maximum number of regions to return; if <= 0, falls back to | ||
| * {@code hbase.meta.scanner.caching} | ||
| * @return up to {@code limit} {@link HRegionLocation}s in start-key order, possibly empty when no | ||
| * more regions exist | ||
| * @throws IOException if a remote or network exception occurs | ||
| */ | ||
| List<HRegionLocation> getRegionLocations(byte[] startKey, int limit) throws IOException; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since this is changing a public interface on a stable branch, this will break external code implementing the interface, why not have a default implementation perhaps throw unsupported or return a blank list?
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, this is overloading an existing methods that takes a data row to find its locations, which is a totally different semantic and so reusing that name for this can be very confusing. Perhaps just call this |
||
|
|
||
| /** | ||
| * Gets the starting row key for every region in the currently open table. | ||
| * <p> | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -119,6 +119,12 @@ public List<HRegionLocation> getAllRegionLocations() throws IOException { | |
| return rawLocations; | ||
| } | ||
|
|
||
| @Override | ||
| public List<HRegionLocation> getRegionLocations(byte[] startKey, int limit) throws IOException { | ||
| // No need to page as region locations are already in-memory. | ||
| return getAllRegionLocations(); | ||
| } | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is actually breaking the contract established by the javadoc by always returning all regions.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I was also wondering same. Thanks for bringing this up. Instead of implementing paging here, shall I throw an exception saying use getAllRegionLocations as I don't see advantage of paging in here.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, that works too, as long as you call out the fact that all implementation may not support it and that they should fallback to getAllRegionLocations. |
||
|
|
||
| @Override | ||
| public TableName getName() { | ||
| return tableName; | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this filtering happening? I didn't see any test coverage either. The existing methods don't do any such filtering correct, so is this even needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filtering is already implemented and happens in inside
MetaTableAccessor.DefaultVisitorBase#visit(). Call chain:HRegionLocator#getRegionLocations()->MetaTableAccessor.TableVisitorBase#visit()->MetaTableAccessor.DefaultVisitorBase#visit().Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am reusing the filtering logic so, no test coverage needed.