Conversation
Incorporated official announcement from Jan 22, 2026:

Completed Milestones:
- ✅ M1: AO Core
- ✅ M2: Native Execution & TEE Support
- ✅ M3: LegacyNet Migration (100x performance gains)

M4 Official Features:
- Decentralized Schedulers
- LiveNet Staking Marketplace
- Streaming Token Distributions

Added comprehensive branch-to-PR mapping:
- 57 open PRs with owners and status
- 70+ merged PRs since release
- Branch ownership for all active development

Key contributors working on M4:
- samcamwilliams: Core protocol, native tokens (expr/1.5, feat/native-tokens)
- speeddragon: Cryptography, fixes (feat/ecdsa_support, PR permaweb#574)
- JamesPiechota: Indexing (feat/arweave-id-offset-indexing, PR permaweb#616)
- noahlevenson: Security testing (impr/secure-actions)
- PeterFarber: TEE attestation (feat/c_snp)
… (i.e. true TX headers). Specify `exclude-data=1` to exclude the data.
…ests. neo-arweave has a round-robin scheme where it will try several nodes looking for a chunk, while arweave.net delegates to a single node regardless of whether or not it has the chunk; this can yield unreliable results (the same query sometimes returns data and sometimes 404s).
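A minimal sketch of the round-robin idea described above (illustrative only, not neo-arweave's actual code; `request_chunk/2` is a hypothetical helper):

```erlang
%% Try each configured node in turn until one returns the chunk.
%% request_chunk/2 is hypothetical; it stands in for whatever HTTP call
%% fetches the chunk at the given weave offset from a node.
fetch_chunk_round_robin(_Offset, []) ->
    {error, not_found};
fetch_chunk_round_robin(Offset, [Node | Rest]) ->
    case request_chunk(Node, Offset) of
        {ok, Chunk} -> {ok, Chunk};
        _Failure -> fetch_chunk_round_robin(Offset, Rest)
    end.
```

By contrast, delegating every request to a single fixed node means a miss on that node is a miss for the whole query, which matches the intermittent 404s noted above.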
c1a32eb to bfce13b
src/dev_copycat_arweave.erl
Outdated
```erlang
%% it).
TestStore = hb_test_utils:test_store(),
StoreOpts = #{ <<"index-store">> => [TestStore] },
Store = [
```
Is there a better way to have a test use a test store for all stores? If I don't do this, the test will use the default (mainnet) store some of the time and the test store other times, which breaks the test.
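For reference, a minimal sketch of the pattern in question, under the assumption that the node's main store is configured via a `store` key; the `arweave_index_store` shape mirrors the config.json example later in this PR:

```erlang
%% Point both the main store and the index store at the per-test store so
%% the test never falls back to the default mainnet cache.
%% The `store` key is an assumption about the Opts layout.
TestStore = hb_test_utils:test_store(),
Opts = #{
    store => [TestStore],
    arweave_index_store => #{ <<"index-store">> => [TestStore] }
},
```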
| <<"node">> => | ||
| #{ | ||
| <<"match">> => <<"^/arweave">>, | ||
| <<"with">> => <<"https://neo-arweave.zephyrdev.xyz">>, |
Route `GET /chunk` to neo-arweave for now, as it is more reliable for this specific endpoint.
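As a rough sketch of the resulting route entry — an assumption pieced together from the hunk above and the `routes => #{ <<"template">> => <<"/chunk">> }` option described in the PR body, not a verbatim copy of the change:

```erlang
%% Assumed shape: send /chunk requests to the neo-arweave node instead of
%% arweave.net. Field names are taken from the hunk and the PR description.
Route = #{
    <<"template">> => <<"/chunk">>,
    <<"node">> =>
        #{
            <<"match">> => <<"^/arweave">>,
            <<"with">> => <<"https://neo-arweave.zephyrdev.xyz">>
        }
},
```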
```erlang
% TODO:
% - should this return composite for any indexed L1 bundles?
% - if so, I guess we need to implement list/2?
% - for now we don't index nested bundle children, but once we
%   do we may also need to return composite for them.
```
Calling this TODO out. Not sure if some of this must be addressed before we merge, or whether it can all wait for a future PR.
Sorry, I was focused on my work and didn't look into this until now.

`composite` is used as a definition for a folder. The information (content type, data, etc.) is under a signature, which is a folder (`composite`). I think `read` provides everything we need to read, so we don't need to define `composite`.
Main change was implementing `hb_store_arweave:type/2`.
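A minimal sketch of what a `type/2` along these lines could look like; this is not the PR's implementation, and the argument order and return atoms are assumptions based on the usual `hb_store` conventions:

```erlang
%% Assumed convention: store callbacks take the store opts first, then the
%% key, and return composite | simple | not_found. Indexed IDs are treated
%% as plain readable values, so no composite handling is needed.
type(StoreOpts, Key) ->
    case read(StoreOpts, Key) of
        {ok, _Value} -> simple;
        not_found -> not_found
    end.
```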
353a1eb to 73b4fad
…indexed

- Old behavior: Count was exclusive and would keep going if `from` was less than `to`. e.g. `from=1000001&to=1000000` will index only block `1000001`, `from=999999&to=1000000` will index all blocks 999999 and lower.
- New behavior: Count is inclusive and stops when `from` is less than `to`. e.g. `from=1000001&to=1000000` will index blocks `1000001` and `1000000`, `from=999999&to=1000000` will index no blocks.
```erlang
%% Terminal clause: counting is inclusive, so stop once Current has passed
%% below the target block height To.
fetch_blocks(Req, Current, To, _Opts) when Current < To ->
    ?event(copycat_arweave,
        {arweave_block_indexing_completed,
            {reached_target, To},
            {initial_request, Req}
        }
    ),
    {ok, To};
%% Recursive clause: resolve the block at the current height, process it,
%% then step down to the next lower block.
fetch_blocks(Req, Current, To, Opts) ->
    BlockRes =
        hb_ao:resolve(
            <<
                ?ARWEAVE_DEVICE/binary,
                "/block=",
                (hb_util:bin(Current))/binary
            >>,
            Opts
        ),
    process_block(BlockRes, Req, Current, To, Opts),
    fetch_blocks(Req, Current - 1, To, Opts).
```
- Old behavior: Count was exclusive and would keep going if `from` was less than `to`. e.g. `from=1000001&to=1000000` will index only block `1000001`, `from=999999&to=1000000` will index all blocks `999999` and lower.
- New behavior: Count is inclusive and stops when `from` is less than `to`. e.g. `from=1000001&to=1000000` will index blocks `1000001` and `1000000`, `from=999999&to=1000000` will index no blocks.
Rarely, we find non-4096-bit RSA-signed transactions in the blockchain.
f8617de to 7ba7275
An example config.json:
```json
{
"arweave_index_ids": true,
"arweave_index_store": {
"index-store": [
{
"store-module": "hb_store_lmdb",
"name": "cache-mainnet/lmdb",
"ao-types": "store-module=\"atom\""
}
]
}
}
```
Enable `copycat_perf` to see metrics logged; the metrics are also written to `copycat_perf.csv`.
Configure with `arweave_index_workers`.
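A minimal sketch of how the indexing options might sit in a node's Opts map. The worker count is a made-up example and the atom key style is an assumption (see the atoms-vs-binaries question in the PR description); how `copycat_perf` is enabled is not shown, since the mechanism isn't spelled out above.

```erlang
%% Illustrative only. arweave_index_workers presumably controls how many
%% indexing workers run; 8 is a made-up example, not a documented default.
Opts = #{
    arweave_index_ids => true,
    arweave_index_workers => 8
},
```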
There is a function to convert from string to atom (…
We need all transactions in order to ensure we can build the correct offsets for any of them.
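A small worked sketch of why every transaction matters here (simplified model with made-up sizes): weave offsets are cumulative, so the absolute offset of any one transaction depends on the sizes of everything that precedes it.

```erlang
%% Simplified illustration: compute cumulative end offsets from a list of
%% tx sizes. Missing even one size would shift every later offset.
Sizes = [256, 1024, 512],
{EndOffsets, _Total} =
    lists:mapfoldl(
        fun(Size, Acc) -> {Acc + Size, Acc + Size} end,
        0,
        Sizes
    ),
%% EndOffsets =:= [256, 1280, 1792]
```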
If an L1 or ans104 transaction includes any tags whose names clash with built-in fields (e.g. anchor, target, data), the tags will be preserved as original-tags. For fields other than `data`, the tag values may be promoted to top-level message keys if the built-in field has a default value (i.e. only promoted if there's no value clash). A `data` tag will never be promoted to a top-level message key, but it will be preserved via original-tags.

Note: in some situations this can create some redundant data. E.g. a tag may be preserved in original-tags *and* as a top-level message key. This redundancy already occurs in other situations, though, so is assumed to be acceptable.
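A minimal sketch of the promotion rule, using simplified shapes rather than the exact HyperBEAM message encoding (the map layout and `original-tags` representation here are illustrative assumptions):

```erlang
%% The tx carries tags that clash with the built-in `target` and `data`
%% fields, and its own `target` field is unset (default value).
Tags = [
    {<<"target">>, <<"some-id">>},
    {<<"data">>, <<"tag-value">>}
],
%% Sketch of the resulting message: the `target` tag is promoted because
%% the built-in field held its default value; the `data` tag is never
%% promoted. Both tags are preserved via original-tags, so the promoted
%% value appears twice -- the redundancy noted above.
Msg = #{
    <<"target">> => <<"some-id">>,
    <<"data">> => <<"...tx data payload...">>,
    <<"original-tags">> => Tags
},
```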
Add support for indexing all transactions and bundled ans104 data items in a block. The index maps the tx or item ID to an offset in the weave. When loading the tx or item, `hb_store_arweave` will query the range of weave data from the configured chunk node and deserialize it.

New options:

- `arweave_index_ids`: when `true`, `dev_copycat_arweave` will index the transactions and ans104 items in a block.
- `arweave_index_store`: configure the store to use for maintaining the index.
- `routes => #{ <<"template">> => <<"/chunk">> }`: configure the gateway to use for `GET /chunk` requests.

Index format (a small decoding sketch follows the notes below):

`<<"ID">> -> <<"IsTX:Offset:Length">>`

Questions/Notes

- `~copycat@1.0`: I updated how it iterates through the range of blocks to be indexed. Let me know if I should revert.
  - Old behavior: Count was exclusive and would keep going if `from` was less than `to`. e.g. `from=1000001&to=1000000` will index only block `1000001`, `from=999999&to=1000000` will index all blocks 999999 and lower.
  - New behavior: Count is inclusive and stops when `from` is less than `to`. e.g. `from=1000001&to=1000000` will index blocks `1000001` and `1000000`, `from=999999&to=1000000` will index no blocks.
- `hb_ao:resolve` vs. `hb_ao:get`: this PR primarily uses `hb_ao:resolve` and only uses `hb_ao:get` when querying a key from a map.
- Atoms (e.g. `arweave_index_store`) vs. binaries (e.g. `<<"arweave-index-store">>`) for new option names? I tried to mimic the conventions already in use.
- Added an `<<"exclude-data">>` arg to `dev_arweave` to allow it to query only the TX header without also downloading the data. I had initially omitted the flag and just forced the data download to be a separate operation, but this created some complexity around the overlap between L2 and L1 IDs. An L2 ID always maps to the full data item, but an L1 ID would only map to the TX header, and then the client would have to do a second `resolve` to get the data payload. The current approach keeps legacy behavior the same (both L2 and L1 IDs map to the full payload), with the option of only querying the TX header where needed.
- `data_root`: this computation depends on how the serialized `data` was "chunked". Unfortunately this information is not currently preserved in HB messages. The majority of transactions likely follow the arweave-js chunking scheme. This PR implements that chunking scheme as the default. In the future we may need to either track chunk boundaries (e.g. as commitment fields), or support multiple chunking schemes (and track those as commitment fields).
- When `dev_arweave` queries the gateway's `/chunk` endpoint it assumes the gateway is running a recent commit from the `arweave` repo (4de096e20028df01f61002620bd7d39297064a5b). This commit has not yet (as of Jan 25, 2026) been included in any formal arweave releases.
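As promised above, a minimal sketch of unpacking an index value of the form `<<"IsTX:Offset:Length">>`. The helper name is hypothetical and the encoding of the `IsTX` flag isn't specified here, so it is passed through untouched:

```erlang
%% Hypothetical helper, not part of the PR: split an index value into its
%% three fields. Offset and Length are numeric; IsTX is left as the raw
%% binary because its exact encoding isn't stated above.
decode_index_value(Value) ->
    [IsTX, Offset, Length] = binary:split(Value, <<":">>, [global]),
    {IsTX, binary_to_integer(Offset), binary_to_integer(Length)}.
```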