Cuckoo For Coca Puffs
In this tutorial we expose a Rust cuckoofilter crate as a Fluence service, deploy it to the Fluence network and use that service in a stylized frontend Rust app.
The Elusive Cuckoo Filter
The cuckoo filter is a probabilistic data structure, much like a bloom filter but better: we can not only add keys to the filter but also delete them. How 'bout that. A quick note on membership tests, bloom filters and probabilities: both bloom and cuckoo filters indicate set exclusion definitively, i.e., a negative answer means the item is not in the filter, while set inclusion is indicated only probabilistically, so a positive answer may be a false positive. For an awesome overview and interactive tutorial, check out Bloom Filters By Example.
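To make the add, contains and delete operations concrete, here is a toy sketch of the idea in plain Rust. It is not how the cuckoofilter crate is implemented: the names (ToyCuckooFilter, BUCKETS, SLOTS, MAX_KICKS), the 8-bit fingerprints and the deterministic eviction order are all assumptions chosen for brevity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const BUCKETS: usize = 64; // power of two, so the XOR partner lookup is reversible
const SLOTS: usize = 4; // fingerprints per bucket
const MAX_KICKS: usize = 100; // relocation attempts before giving up

fn hash64<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

struct ToyCuckooFilter {
    buckets: Vec<[u8; SLOTS]>, // a slot value of 0 means "empty"
}

impl ToyCuckooFilter {
    fn new() -> Self {
        Self { buckets: vec![[0u8; SLOTS]; BUCKETS] }
    }

    // Non-zero 8-bit fingerprint plus the first candidate bucket.
    fn fingerprint_and_index(item: &[u8]) -> (u8, usize) {
        let h = hash64(item);
        (((h >> 32) as u8).max(1), (h as usize) % BUCKETS)
    }

    // The partner bucket depends only on the current bucket and the fingerprint,
    // so a stored fingerprint can be relocated without knowing the original item.
    fn alt_index(index: usize, fp: u8) -> usize {
        (index ^ (hash64(&fp) as usize)) % BUCKETS
    }

    fn try_store(&mut self, index: usize, fp: u8) -> bool {
        for slot in self.buckets[index].iter_mut() {
            if *slot == 0 {
                *slot = fp;
                return true;
            }
        }
        false
    }

    fn add(&mut self, item: &[u8]) -> bool {
        let (mut fp, i1) = Self::fingerprint_and_index(item);
        let i2 = Self::alt_index(i1, fp);
        if self.try_store(i1, fp) || self.try_store(i2, fp) {
            return true;
        }
        // Both buckets are full: evict a resident fingerprint and move it on.
        let mut index = i2;
        for kick in 0..MAX_KICKS {
            std::mem::swap(&mut fp, &mut self.buckets[index][kick % SLOTS]);
            index = Self::alt_index(index, fp);
            if self.try_store(index, fp) {
                return true;
            }
        }
        false // treated as full; the last evicted fingerprint is dropped in this toy
    }

    fn contains(&self, item: &[u8]) -> bool {
        let (fp, i1) = Self::fingerprint_and_index(item);
        let i2 = Self::alt_index(i1, fp);
        self.buckets[i1].contains(&fp) || self.buckets[i2].contains(&fp)
    }

    fn delete(&mut self, item: &[u8]) -> bool {
        let (fp, i1) = Self::fingerprint_and_index(item);
        let i2 = Self::alt_index(i1, fp);
        for index in [i1, i2] {
            if let Some(slot) = self.buckets[index].iter_mut().find(|s| **s == fp) {
                *slot = 0; // clearing one matching fingerprint is what makes deletion possible
                return true;
            }
        }
        false
    }
}

fn main() {
    let mut filter = ToyCuckooFilter::new();
    assert!(filter.add(b"alice"));
    assert!(filter.contains(b"alice"));
    assert!(filter.delete(b"alice"));
    assert!(!filter.contains(b"alice"));
    println!("add, contains and delete behaved as expected");
}
```

Because only short fingerprints are stored, two different items can share a fingerprint and a bucket, which is where the false positives come from. Deletion works because removing one matching fingerprint never hides another item that was actually added.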
Most Ethereum developers are familiar with bloom filters: every time a block is forged, the address of every logging contract, along with the associated indexed fields from the logs generated by the executed transactions, is added to a bloom filter, which in turn is added to the block header. See the Yellow Paper for more info.
Cuckoo Filter as a Fluence Service
Aside from the fact that cuckoo filters (CF) may be part of your distributed workflow, in which case a service implementation comes in more than handy, there is another reason why a CF as a Service is useful: CF implementations tend not to follow a common standard and are consequently implementation specific. This makes sharing or re-using filters challenging. CF as a Service greatly alleviates these issues.
Getting Started
We are assuming that you have had the opportunity to work through the Fluence documentation and have it handy as a reference when necessary.
Rather than code our own cuckoo filter, we use the awesome cuckoofilter crate as our starting point and write the wrapper functions in our main.rs. This turns out to be a pretty straightforward process, and the crate functionality maps nicely into our Fluence module except for some type limitations in the exposed functions, i.e., the #[fce] functions.
Rust is strongly typed, which is reflected in its hashing, and that doesn't fully map into our module. For example, in the native crate H(5_u32) != H(5_u64), whereas our module only receives byte arrays, so two values that arrive as the same bytes also produce the same hash. Due to the lack of generics in WASI, we lose some of the fine-grained discrimination available in the native crate. However, this is a small price to pay for the interoperability gains made.
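The effect can be illustrated with the standard library alone. The sketch below uses std's DefaultHasher as a stand-in, which is an assumption made for illustration and not necessarily the hasher configured in the service; hash_of is a hypothetical helper.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical helper: hash any hashable value with std's default hasher.
fn hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Natively, the integer width feeds into the hash: 4 bytes versus 8 bytes.
    let typed_differ = hash_of(&5_u32) != hash_of(&5_u64);

    // Across the module boundary only bytes arrive. If two callers both encode
    // the value 5 as the single byte [5], the original type is no longer visible.
    let from_u32: Vec<u8> = vec![5_u32 as u8];
    let from_u64: Vec<u8> = vec![5_u64 as u8];
    let bytes_match = hash_of(&from_u32) == hash_of(&from_u64);

    // A caller can keep the distinction by agreeing on a fixed-width encoding.
    let wide_u32 = 5_u32.to_le_bytes().to_vec(); // 4 bytes
    let wide_u64 = 5_u64.to_le_bytes().to_vec(); // 8 bytes
    let encodings_differ = wide_u32 != wide_u64;

    println!("typed hashes differ: {}", typed_differ);
    println!("identical byte arrays hash alike: {}", bytes_match);
    println!("fixed-width encodings differ: {}", encodings_differ);
}
```

In other words, the type information has to travel in the byte encoding the caller chooses, since the service itself only ever sees Array<Array<U8>>.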
Now that we have our module in place, we compile our code into a wasm module with the Fluence fce command line tool, using build.sh to do so. If you haven't installed fce, or it's been a while:
cargo +nightly install fcli --force
and proceed to run the build script:
./build.sh
Let's unpack the script:
#!/usr/bin/env bash
set -o errexit -o nounset -o pipefail
mkdir -p artifacts
cd fce-cuckoo
cargo update
fce build --release
cd ..
rm -f artifacts/*
cp fce-cuckoo/target/wasm32-wasi/release/fce-cuckoo.wasm artifacts/
First, we create a new dir, artifacts, that serves as a convenient parking lot for our wasm modules. We then execute the typical cargo update followed by fce build --release. The fce cli closely follows cargo, so we simply build the wasm module with the release flag. Finally, we clear out any stale files and copy the wasm module from the deep recesses of the compiler target directory tree to the much more convenient artifacts directory.
Before we proceed, we need to create a service configuration. That is, we need to specify a few attributes defining our service. This is done in the Config.toml file. For our purposes, a simple specification limited to the name and logging attributes suffices. See the reference for more advanced configurations.
Now that we have our cuckoo filter wasm module and service configuration, we can explore and test our masterpiece locally using the Fluence FCE repl. In the fce-cuckoo dir, fire up the repl with fce-repl Config.toml, which gets us to the command line:
Welcome to the FCE REPL (version 0.1.33)
app service was created with service id = 80580519-9da6-477c-8265-0eb27d1f89cc
elapsed time 166.494711ms
1>
The first thing to do is check that all our (external) interfaces are available:
1> interface
Loaded modules interface:
fce-cuckoo:
fn create_and_add_cf(data: Array<Array<U8>>) -> String
fn is_empty(cf: String) -> I32
fn memory_usage(cf: String) -> U64
fn service_info() -> String
fn delete(cf: String, items: Array<Array<U8>>) -> Array<I32>
fn create_cf(with_capacity: U32) -> String
fn contains(cf: String, items: Array<Array<U8>>) -> Array<I32>
fn len(cf: String) -> U64
fn add(data: Array<Array<U8>>) -> String
Looks like we're all good to go and we can now run each of those functions with the appropriate signature parameters using the <command, module name, function name, function parameter> syntax. For example,
2> call fce-cuckoo service_info []
result: String("{\"name\":\"Cuckoo Filter\",\"package\":\"https://crates.io/crates/cuckoofilter\",\"source\":\"https://github.com/axiomhq/rust-cuckoofilter\",\"version\":\"0.5.0\"}")
elapsed time: 158.616µs
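Functions such as create_and_add_cf expect their items as Array<Array<U8>>. A small helper can turn string keys into that shape and print a matching repl command. This is only a sketch: to_byte_arrays and to_json are hypothetical helpers, and it assumes the repl accepts the argument list as a JSON array, in line with the empty [] used for service_info above.

```rust
// Hypothetical helper: turn string keys into the Array<Array<U8>> shape.
fn to_byte_arrays(keys: &[&str]) -> Vec<Vec<u8>> {
    keys.iter().map(|key| key.as_bytes().to_vec()).collect()
}

// Hypothetical helper: render the nested byte arrays as JSON text.
fn to_json(items: &[Vec<u8>]) -> String {
    let inner: Vec<String> = items
        .iter()
        .map(|bytes| {
            let nums: Vec<String> = bytes.iter().map(|b| b.to_string()).collect();
            format!("[{}]", nums.join(","))
        })
        .collect();
    format!("[{}]", inner.join(","))
}

fn main() {
    let items = to_byte_arrays(&["ab", "c"]);
    // The outer brackets are the argument list, which holds a single argument here.
    println!("call fce-cuckoo create_and_add_cf [{}]", to_json(&items));
}
```

For the keys "ab" and "c" this prints call fce-cuckoo create_and_add_cf [[[97,98],[99]]], which can then be pasted at the repl prompt.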
We can also explore the environment variables available to our functions with the envs command:
Environment variables:
tmp=/var/folders/yq/fvkl2sbd14sc4_kt00pqk76r0000gn/T/80580519-9da6-477c-8265-0eb27d1f89cc/tmp
local=/var/folders/yq/fvkl2sbd14sc4_kt00pqk76r0000gn/T/80580519-9da6-477c-8265-0eb27d1f89cc/local
service_id=80580519-9da6-477c-8265-0eb27d1f89cc
Simply type help on the repl command line to see all features available.
From Local Module To Deployed Service
For our purposes, we just want fce-cuckoo to be a granular, self-contained service, and it's time to deploy it to the network. In order to manage the distribution process, we need the Fluence fldist tool. If you have not installed it:
npm i @fluencelabs/fldist -g
To recap from the documentation: creating a Fluence service is essentially a three-step process: upload the wasm module(s), create and upload a blueprint, which contains all the information required for a service to be created, and instantiate the service. We can use fldist upload, fldist add_blueprint, and finally fldist create_service to accomplish these tasks sequentially. Or, we can use fldist new_service to combine all three steps. But before we go there, we need our (deployment) seed, which is a Base58 derivation from a private key. The fldist tool has a convenience function to help us out:
mbp16~/localdev/lw3d/fluence-cuckoo(main|●1…) % fldist create_keypair
{
id: '12D3KooWRKibxAS9NmdXcJ95GYc5CU25UTw8ABzfgNtsHkwHLnHm',
privKey: 'CAESYFf+d8V7XNXdWp1/8Lt3+beXImJP/8bYDZ6do0yBu6ur5mRAupQQGFayTLgJAhafw/zIv/9qJBjD4D6bgZdWZZjmZEC6lBAYVrJMuAkCFp/D/Mi//2okGMPgPpuBl1ZlmA==',
pubKey: 'CAESIOZkQLqUEBhWsky4CQIWn8P8yL//aiQYw+A+m4GXVmWY',
seed: '6vVXJFGhmDk3h58aGNzrGxuoK9jvrfYax1rCBJaNDnUi'
}
Take note of the keys and seed and keep them safe. Now that we have our seed, we can create a Fluence service.
We also need a service configuration file, cuckoo_cfg.json, which is trivial in our case and merely specifies the service name:
{
"name": "fce-cuckoo"
}
Almost there. We want a name for our blueprint, which should be a UUID. You can generate a valid uuid any way you want, including with the nifty uuidgen:
mbp16~/localdev/lw3d/fluence-cuckoo/fce-cuckoo(main|●1…) % uuidgen
CD610F03-D631-4F28-B22F-AFC637373626
We're finally ready to deploy our service:
mbp16~/localdev/lw3d/fluence-cuckoo/fce-cuckoo(main|●1…) % fldist new_service -n CD610F03-D631-4F28-B22F-AFC637373626 --ms artifacts/fce-cuckoo.wasm:cuckoo_cfg.json -s 6vVXJFGhmDk3h58aGNzrGxuoK9jvrfYax1rCBJaNDnUi --env testnet
client seed: 6vVXJFGhmDk3h58aGNzrGxuoK9jvrfYax1rCBJaNDnUi
client peerId: 12D3KooWRKibxAS9NmdXcJ95GYc5CU25UTw8ABzfgNtsHkwHLnHm
node peerId: 12D3KooWBUJifCTgaxAUrcM9JysqCcS4CS8tiYH5hExbdWCAoNwb
uploading blueprint CD610F03-D631-4F28-B22F-AFC637373626 to node 12D3KooWBUJifCTgaxAUrcM9JysqCcS4CS8tiYH5hExbdWCAoNwb via client 12D3KooWRKibxAS9NmdXcJ95GYc5CU25UTw8ABzfgNtsHkwHLnHm
creating service d7003ece-2f94-4c44-b814-d3f0f136d526
service id: f3137ae9-e687-443d-be1a-9f20a3894d4a
service created successfully
Awesome. We now have our service on the Fluence testnet. As mentioned earlier, the blueprint name is the uuid we provided and, upon service creation, we get back a service reference, d7003ece-2f94-4c44-b814-d3f0f136d526, and a service id, f3137ae9-e687-443d-be1a-9f20a3894d4a, which we need in order to put our cuckoo service to work.
There are different ways to interact with our distributed service but they all go through AIR, the Aquamarine Intermediate Representation. See the air-scripts directory for a few example scripts.
Let's test the service_info function we reviewed earlier and use the cuckoo_service_info.clj script:
mbp16~/localdev/lw3d/fluence-cuckoo/fce-cuckoo(main|●1…) % fldist run_air -p air-scripts/cuckoo_service_info.clj -d '{"service": "f3137ae9-e687-443d-be1a-9f20a3894d4a"}' --env testnet
client seed: rcxj5V4CxGPqFi4Z4ddQLGYajSKa9mj9Rfi5KqJcLjX
client peerId: 12D3KooWHzbkGB8NjLpTsX7GWu687jsWy3G5Tux5DWfhj7HiYkj8
node peerId: 12D3KooWBUJifCTgaxAUrcM9JysqCcS4CS8tiYH5hExbdWCAoNwb
Particle id: 4f6803dd-efca-4e7e-83a1-7abe465e6e89. Waiting for results... Press Ctrl+C to stop the script.
===================
[
"{\"name\":\"Cuckoo Filter\",\"package\":\"https://crates.io/crates/cuckoofilter\",\"source\":\"https://github.com/axiomhq/rust-cuckoofilter\",\"license\":\"MIT\",\"version\":\"0.5.0\"}"
]
[
[
{
peer_pk: '12D3KooWBUJifCTgaxAUrcM9JysqCcS4CS8tiYH5hExbdWCAoNwb',
service_id: 'f3137ae9-e687-443d-be1a-9f20a3894d4a',
function_name: 'service_info',
json_path: ''
}
]
]
===================
You may have to press Ctrl+C to stop the script.
Calling our remote fn service_info() -> String with the air script gives us the function return value, as expected:
[
"{\"name\":\"Cuckoo Filter\",\"package\":\"https://crates.io/crates/cuckoofilter\",\"source\":\"https://github.com/axiomhq/rust-cuckoofilter\",\"license\":\"MIT\",\"version\":\"0.5.0\"}"
]
Let's get ourselves a cuckoo filter:
mbp16~/localdev/lw3d/fluence-cuckoo/fce-cuckoo(main|●1…) % fldist run_air -p air-scripts/cuckoo_create_cf.clj -d '{"service": "f3137ae9-e687-443d-be1a-9f20a3894d4a"}' --env testnet
client seed: 76qEx9wTgUweViSCdLMc7Z9tma9AkawGTFWZCKZNER7Z
client peerId: 12D3KooWCeZV2qMyiaVTKUYBQXp5Moxf9gptDFHTDuCZfivZz9Fn
node peerId: 12D3KooWBUJifCTgaxAUrcM9JysqCcS4CS8tiYH5hExbdWCAoNwb
Particle id: 95f65463-f211-469a-a9e5-66c4a50e1668. Waiting for results... Press Ctrl+C to stop the script.
===================
[
[
120,
156,
237,
195,
<snip>
185,
31
]
]
[
[
{
peer_pk: '12D3KooWBUJifCTgaxAUrcM9JysqCcS4CS8tiYH5hExbdWCAoNwb',
service_id: 'f3137ae9-e687-443d-be1a-9f20a3894d4a',
function_name: 'create_cf',
json_path: ''
}
]
]
===================
Very cool. We now have the compressed byte representation of a cuckoo filter.
Coming soon: How to use the service from a Rust application.