Add RangeScan() support #160

Merged
prologic merged 7 commits from range_scan into master 2 years ago
Owner
There is no content yet.
awnumar (Migrated from github.com) reviewed 3 years ago
bitcask.go Outdated
awnumar (Migrated from github.com) commented 3 years ago
Poster
Owner

Are keys stored in an ordered way? This implementation requires iterating over every key and checking if it lies in some range. If there are lots of keys this becomes expensive.

In any case it seems like this function is only useful for situations where keys are intrinsically ordered and have some kind of "meaning"? I may be wrong. What do you imagine the use-case to be?

Are keys stored in an ordered way? This implementation requires iterating over _every_ key and checking if it lies in some range. If there are lots of keys this becomes expensive. In any case it seems like this function is only useful for situations where keys are intrinsically ordered and have some kind of "meaning"? I may be wrong. What do you imagine the use-case to be?
prologic marked this conversation as resolved
prologic reviewed 3 years ago
bitcask.go Outdated
Poster
Owner

The underlying trie is ordered AFAIK - yes. So does your comment still apply re the "expensve" Maybe this is better moved to a fork of go-adaptive-radix-tree?

The underlying trie is ordered AFAIK - yes. So does your comment still apply re the "expensve" Maybe this is better moved to a fork of [go-adaptive-radix-tree](github.com/plar/go-adaptive-radix-tree)?
prologic marked this conversation as resolved
awnumar (Migrated from github.com) reviewed 3 years ago
bitcask.go Outdated
awnumar (Migrated from github.com) commented 3 years ago
Poster
Owner

@prologic Maybe it's possible to range over only those keys within the specified range? Since they are ordered.

@prologic Maybe it's possible to range over only those keys within the specified range? Since they are ordered.
prologic marked this conversation as resolved
prologic reviewed 3 years ago
bitcask.go Outdated
Poster
Owner

I think it is; but not with the current API adaptive-radix-tree exports. As I said we would have to fork it and add the funcitonlity there. I'll look into doing that; but it may take some time :)

I _think_ it is; but not with the current API adaptive-radix-tree exports. As I said we would have to fork it and add the funcitonlity there. I'll look into doing that; but it may take some time :)
prologic marked this conversation as resolved
github-actions[bot] commented 3 years ago (Migrated from github.com)
Poster
Owner

This Pull Request has gone stale and will be closed automatically.

If this is incorrect, please reopen and add the label on hold.

Thank you.

This Pull Request has gone stale and will be closed automatically. If this is incorrect, please reopen and add the label `on hold`. Thank you.
Hassall (Migrated from github.com) requested changes 2 years ago
bitcask.go Outdated
Hassall (Migrated from github.com) commented 2 years ago
Poster
Owner

Any reason to not have the f func(key []byte) consume a value? f func(key, value []byte)

Any reason to not have the `f func(key []byte)` consume a value? `f func(key, value []byte)`
Hassall (Migrated from github.com) commented 2 years ago
Poster
Owner

Range can provide better performance when trying to capture many keys. It should be up to the user to decide which is better in their case.

Search for multiple keys:
Range = O(n)
Has/Get = O(k * log n) where k is the number of operations

Range can provide better performance when trying to capture many keys. It should be up to the user to decide which is better in their case. **Search for multiple keys:** Range = O(n) Has/Get = O(k * log n) *where k is the number of operations*
Hassall (Migrated from github.com) commented 2 years ago
Poster
Owner

throw error if end < start

throw error if end < start
prologic marked this conversation as resolved
bitcask.go Outdated
Hassall (Migrated from github.com) commented 2 years ago
Poster
Owner

Would be nice if we could avoid iterating the trie when our range exceeds the bounds. The trie API only offers a minimum/maximum for the value and not the key (in constant time) .

for example

minKey, val := b.trie.Minimum()
maxKey, val := b.trie.Maximum()

if minKey > end || maxKey < start {
   return
}

(Very easy to make this change however)

Would be nice if we could avoid iterating the trie when our range exceeds the bounds. The trie API only offers a minimum/maximum for the value and **not the key** (in constant time) . for example ```go minKey, val := b.trie.Minimum() maxKey, val := b.trie.Maximum() if minKey > end || maxKey < start { return } ``` (Very easy to make this change however)
prologic marked this conversation as resolved
prologic reviewed 2 years ago
bitcask.go Outdated
Poster
Owner

Done

Done
prologic marked this conversation as resolved
Poster
Owner

I'm not sure what's up with my test for this :/

I'm not sure what's up with my test for this :/
taigrr commented 2 years ago
Collaborator

I can see the usefulness of this. I'll take a look.

I can see the usefulness of this. I'll take a look.
prologic added 2 commits 2 years ago
prologic removed the on hold tests labels 2 years ago
prologic added 1 commit 2 years ago
continuous-integration/drone/pr Build is failing Details
19b528e883
Fix imports
prologic added 1 commit 2 years ago
Poster
Owner

This PR is now ready too!

This PR is now ready too!
taigrr requested changes 2 years ago
bitcask.go Outdated
if commonPrefix == nil {
return ErrInvalidRange
}
taigrr commented 2 years ago
Collaborator

Can we grab + defer() the RLock() here?

Can we grab + defer() the RLock() here?
Poster
Owner

Ahh yes of course! Forgot about that ;)

Ahh yes of course! Forgot about that ;)
prologic marked this conversation as resolved
bitcask.go Outdated
}
b.trie.ForEachPrefix(commonPrefix, func(node art.Node) bool {
if bytes.Compare(node.Key(), start) >= 0 && bytes.Compare(node.Key(), end) <= 0 {
taigrr commented 2 years ago
Collaborator

This might be a place where we can get some extra efficiency by skipping the tail end of the tries.

Something like:

b.trie.ForEachPrefix(commonPrefix, func(node art.Node) bool {
		if bytes.Compare(node.Key(), start) >= 0 && bytes.Compare(node.Key(), end) <= 0 {
        	if err = f(node.Key()); err != nil {
				return false
			}
			return true
        } else if bytes.Compare(node.Key(), start) >= 0 && bytes.Compare(node.Key(), end) > 0 {
        	return false
        }

This might be a place where we can get some extra efficiency by skipping the tail end of the tries. Something like: ``` b.trie.ForEachPrefix(commonPrefix, func(node art.Node) bool { if bytes.Compare(node.Key(), start) >= 0 && bytes.Compare(node.Key(), end) <= 0 { if err = f(node.Key()); err != nil { return false } return true } else if bytes.Compare(node.Key(), start) >= 0 && bytes.Compare(node.Key(), end) > 0 { return false } ```
Poster
Owner

Ahh yes brilliant!

Ahh yes brilliant!
taigrr commented 2 years ago
Collaborator

Whoops, updated the snippet above. Don't use it as it was a minute ago. My bad.

Whoops, updated the snippet above. Don't use it as it was a minute ago. My bad.
Poster
Owner

All good! Should be right now :)

All good! Should be right now :)
taigrr commented 2 years ago
Collaborator

Hmm, I'm not seeing it updated.

Hmm, I'm not seeing it updated.
Poster
Owner

Hang on... I'm slow ;)

Hang on... I'm slow ;)
taigrr marked this conversation as resolved
prologic added 1 commit 2 years ago
continuous-integration/drone/pr Build is failing Details
d1f070b9e0
Address feedback from review
prologic force-pushed range_scan from d1f070b9e0 to 0994438dc8 2 years ago
prologic added 1 commit 2 years ago
continuous-integration/drone/pr Build is passing Details
80ee640d79
Fix skipping tail end of trie
taigrr commented 2 years ago
Collaborator

LGTM

LGTM
Poster
Owner

This Pull Request doesn't have enough approvals yet. 0 of 1 approvals granted.

Wanna approve it? :D

> This Pull Request doesn't have enough approvals yet. 0 of 1 approvals granted. Wanna approve it? :D
taigrr commented 2 years ago
Collaborator

Actually I have one update to this. I want to add a SiftRange(). Almost done.

Actually I have one update to this. I want to add a SiftRange(). Almost done.
Poster
Owner

Do it as a separate PR so you retain authorship :D

Do it as a separate PR so you retain authorship :D
Poster
Owner

Approve and I'll merge this one!

Approve and I'll merge this one!
taigrr added 1 commit 2 years ago
continuous-integration/drone/pr Build is passing Details
4f295057f0
Add SiftRange functionality + tests
taigrr commented 2 years ago
Collaborator

Oh, whoops. I pushed to the wrong remote :|

Oh, whoops. I pushed to the wrong remote :|
taigrr commented 2 years ago
Collaborator

It's a totally different feature though, so it can be in a different commit.

It's a totally different feature though, so it can be in a different commit.
taigrr closed this pull request 2 years ago
taigrr reopened this pull request 2 years ago
taigrr commented 2 years ago
Collaborator

LGTM

LGTM
taigrr commented 2 years ago
Collaborator

I can't formally approve, I don't have permission (I don't think)

I can't formally approve, I don't have permission (I don't think)
Poster
Owner

I can't formally approve, I don't have permission (I don't think)

Pretty sure I gave you access to review/approve hmmm 🤔

> I can't formally approve, I don't have permission (I don't think) Pretty sure I gave you access to review/approve hmmm 🤔
prologic requested review from taigrr 2 years ago
Poster
Owner
There is no content yet.
prologic merged commit 9b0daa8a30 into master 2 years ago
prologic referenced this issue from a commit 2 years ago

Reviewers

taigrr was requested for review 2 years ago
continuous-integration/drone/pr Build is passing
The pull request has been merged as 9b0daa8a30.
Sign in to join this conversation.
Loading…
There is no content yet.