Why not shard the bakers?
With respect to sharding a chain of blocks, Tezos blocks appear to be relatively linear and therefore shardable.
There are two hurdles to spending Tezos once received: typically c confirmations (e.g. 30), and a 1:1 block hierarchy that runs all blocks in one dimension (x being a one-dimensional depth axis hereafter).
Why not add a y axis for shards, baking along multiple hash radixes?
Just as if we were sharding a key-value store, this would enable a block to confirm against a much larger spread of bakers, along a progression of nodes, to arrive at the one true ledger (the x depth dimension).
We want to divide and conquer block baking and endorsement as if merging multiple chains processed by a sharding force multiplier, while continuing to deliver the facade of a blockchain based on depth (the x index).
The result would function and comply with the existing blockchain methods and algorithms as a single chain of blocks with a single x index. The delegation of baking and endorsement work would occur along key-value radix boundaries, proportionate to the pool of active bakers (a heuristic rolling average), forming a y axis, which would presumably be determined as a mempool tweak. Baking and endorsement should accommodate a minimum of 2 shards, and should likewise use local and provable key values to maximize the variance of which 2+ chains are tasked between any pool of bakers.
So we are discussing how to fairly bake all the transactions at depth x by dividing the load across a y-coefficient width of bakers. We then distribute all the work to the topology that exists today, effectively dropping the y axis once a block is complete, via the mempool's exposure to processing-agenda knowledge that should exist identically in all connected nodes.
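As a rough sketch of the radix-style shard assignment described above, the following Python (all names hypothetical, nothing here comes from the Tezos codebase) maps each pending transaction to a y-axis shard by the leading bytes of its hash, with the 2-shard floor the proposal calls for:

```python
import hashlib

def shard_for_tx(tx_hash: str, num_shards: int) -> int:
    """Map a transaction to a y-axis shard by the leading radix of its hash.

    num_shards would be derived from the rolling average of active bakers;
    the proposal requires a floor of 2 shards.
    """
    num_shards = max(2, num_shards)  # never fewer than 2 shards
    digest = hashlib.sha256(tx_hash.encode()).digest()
    # Use the first bytes of the digest as the radix key.
    return int.from_bytes(digest[:4], "big") % num_shards

# Example: partition a batch of pending transactions at one depth x
txs = ["op1", "op2", "op3", "op4", "op5"]
shards = {}
for tx in txs:
    shards.setdefault(shard_for_tx(tx, 4), []).append(tx)
```

Because the assignment depends only on the transaction hash, every connected node can compute the same partition independently, which is what lets the y axis be dropped once blocks are complete.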
Non-zero load balancing fairness:
We can draw from cuckoo hash semantics (Cuckoo hashing - Wikipedia): create y buckets into which baking nodes fall by hashing. When a given bucket is full, the cuckoo hash algorithm designates an alternative bucket, and so on, giving a reasonable and performant way to balance work more or less fairly.
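A minimal sketch of that cuckoo-style placement, with illustrative `BUCKETS` and `CAPACITY` constants rather than protocol values:

```python
import hashlib

BUCKETS = 8
CAPACITY = 2  # max bakers per bucket (the "bucket depth")

def _h(key: str, seed: int) -> int:
    """One of two independent hash functions, selected by seed."""
    d = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(d[:4], "big") % BUCKETS

def place(buckets, baker_id, max_kicks=16):
    """Cuckoo-style placement: each baker has two candidate buckets;
    if both are full, evict a resident and try to re-place it."""
    key = baker_id
    for _ in range(max_kicks):
        b1, b2 = _h(key, 1), _h(key, 2)
        for b in (b1, b2):
            if len(buckets[b]) < CAPACITY:
                buckets[b].append(key)
                return True
        # Both candidates full: evict the oldest resident of b1,
        # install the current key, and retry with the evictee.
        victim = buckets[b1].pop(0)
        buckets[b1].append(key)
        key = victim
    return False  # table too loaded; a real design would rehash or grow

bakers = [f"tz1_baker_{i}" for i in range(12)]
buckets = {i: [] for i in range(BUCKETS)}
placed = sum(place(buckets, b) for b in bakers)
```

The bounded eviction loop is what keeps the distribution "more or less fair": no bucket can exceed its depth, so no baker can be starved or overloaded by an unlucky hash.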
Transparent baking-only load balancing factors:
The goal here is to spark discussion illuminating why (or why not) it is possible to operate on every transaction purely on the basis of prior block transactions already published, with no validation failure conditions based on sibling transactions at the same depth. Once a shard is designated in the mempool and the processing nodes are determined, they perform the work and publish one or more blocks with x and y correspondence, as if we were sharding a directory tree or a NoSQL key-value store (Shard (database architecture) - Wikipedia). We then deliver the now-smaller blocks and reconstruct the chain as the previously depth-only, one-dimensional chain. For all intents and purposes, the correct choice of shard determination should introduce no divergence from the contract and blockchain processing semantics in play at any given revision.
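To illustrate how the depth-only view could be preserved, a toy reconstruction step (hypothetical shapes, not actual Tezos block structures) might merge the shard blocks produced at one depth x back into a single logical block, ordered by y:

```python
def merge_depth(shard_blocks):
    """Merge the shard blocks produced at one depth x back into a single
    logical block, ordered by y index, so downstream consumers see the
    usual one-dimensional chain."""
    merged = []
    for y in sorted(shard_blocks):
        merged.extend(shard_blocks[y])
    return merged

# shard_blocks maps y index -> transactions baked by that shard
chain_block = merge_depth({1: ["tx_c"], 0: ["tx_a", "tx_b"]})
# chain_block == ["tx_a", "tx_b", "tx_c"]
```

The merge is only safe under the stated precondition: no transaction's validity may depend on a sibling transaction at the same depth, since siblings were baked without visibility into each other.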
We can assume the baking node agenda is known in advance, as is presently done with Tezos (Edo), but the transactions and keys contained therein are relatively difficult to predict.
Minimum mean shard size
Determined by the rolling average of bakers and a reasonable minimum overlap to regulate the bucket depth of the cuckoo hash mentioned above. This is a mempool feature performed with the queue on hand. It reduces the attack surface of exploiting empty, waiting bakers via spoofed seed values or the engineering of grossly uneven queue depths to desync the network.
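A sketch of how the mempool might derive a shard count from a rolling average of active bakers while enforcing a minimum bucket depth; `WINDOW` and `MIN_OVERLAP` are illustrative constants, not protocol parameters:

```python
from collections import deque

class ShardSizer:
    """Track a rolling average of active bakers and derive a shard count,
    enforcing the minimum per-shard overlap discussed above."""
    WINDOW = 16       # cycles of baker counts to average over (illustrative)
    MIN_OVERLAP = 8   # minimum bakers per shard, i.e. bucket depth (illustrative)

    def __init__(self):
        self.samples = deque(maxlen=self.WINDOW)

    def observe(self, active_bakers: int):
        self.samples.append(active_bakers)

    def shard_count(self) -> int:
        avg = sum(self.samples) / len(self.samples)
        # Never fewer than 2 shards, and never so many shards that
        # any one shard falls below the minimum overlap of bakers.
        return max(2, int(avg // self.MIN_OVERLAP))
```

Keeping the shard count a slow function of a rolling average, rather than the instantaneous baker count, is what blunts attempts to desync the network by engineering sudden swings in apparent participation.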
Impractical Long Range Attacks on Shards:
Shard delegation would correspond to: Baker’s Node id * Baker’s Published Key * (Transaction hash * n ) * (hash of x + block(x-1…5).merkle)
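Reading that expression loosely as "bind all of these inputs together into one digest", a toy version (field names illustrative, combination by concatenation rather than the multiplication written above) could look like:

```python
import hashlib

def delegation_key(node_id, pub_key, tx_hash, depth_x, recent_merkles,
                   num_shards=4):
    """Derive a shard assignment from the baker's identity, the transaction,
    the depth, and the merkle roots of the last few blocks.

    All field names are illustrative; real values would come from the
    block headers and the baker's published identity.
    """
    material = ":".join([node_id, pub_key, tx_hash, str(depth_x),
                         *recent_merkles])
    digest = hashlib.sha256(material.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Because the recent merkle roots feed the digest, an attacker cannot precompute shard assignments far in advance: any rewrite of the last few blocks changes every downstream assignment, which is what makes long-range attacks on a shard impractical.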
Redistribution of transactions on contested blocks
We would expect that rewinding chain transactions, and any change in the number of transactions, will create completely unrelated y correspondences between baking nodes and the contested blocks. This should be well afforded by the force-multiplier effect of using a greater number of processing homes to verify smaller chunks, arriving at earlier completion, amortized.
Towards the evolution of more scalable smart contracts:
We assume a successful network effect needs, at minimum, an effective and self-organizing heuristic for load balancing and handling usage spikes. We should assume that gas-price bidding wars are an unfavorable outcome, one that can be addressed by the acceptance of the Tezos utility, enticing an uptick in the knowledgeable baking community, and virtuous cycles all around from the simple mechanic of force multiplying and reducing the intervals between baking and endorsing as needed.
Luck: to what degree would this change the baking luck, payouts, and staking requirements? I’m the least qualified individual I know to comment on this matter.
With the help of the Tezos Discord I was pointed toward some relevant points about the state of sharding, the mempool's present and future expectations, and its ideal situation as a block delegator and load-balancing agent.
I make no claim to knowing the specifics of the OCaml codebase or having experience with the topology; my background is in database and NoSQL architecture and performance tuning. I would greatly appreciate additional topical pointers to continue refining and improving my understanding. (This is representative of what we're doing, with a few safety features added to avoid bad actors: The Dynamo Paper | DynamoDB, explained.)
(apologies for mixing rich-text with markdown in advance)