All Blockchains are actually Event Sourcing systems.
They all work with streams of transactions coming from different sources into a limited number of actual chains.
Unfortunately, no available implementation (as of now) actually treats Blockchains as streaming systems, or tries to turn this natural feature of Blockchains into a business benefit.
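To make the event-sourcing view concrete, here is a minimal sketch (all names are illustrative): the chain is an append-only log of transaction events, and the current state is just a left fold over that stream.

```python
# Illustrative sketch: a blockchain viewed as an event-sourced system.
# The chain is an append-only log of transaction events; the current
# state (balances) is a left fold over that log.
from dataclasses import dataclass

@dataclass(frozen=True)
class Transfer:
    sender: str
    receiver: str
    amount: int

def apply_event(balances: dict, tx: Transfer) -> dict:
    """Pure state transition: state' = f(state, event)."""
    new = dict(balances)
    new[tx.sender] = new.get(tx.sender, 0) - tx.amount
    new[tx.receiver] = new.get(tx.receiver, 0) + tx.amount
    return new

def replay(log):
    """Rebuild the current state by folding over the whole event stream."""
    state = {}
    for tx in log:
        state = apply_event(state, tx)
    return state

log = [Transfer("alice", "bob", 10), Transfer("bob", "carol", 4)]
print(replay(log))  # {'alice': -10, 'bob': 6, 'carol': 4}
```

Every Blockchain node already performs exactly this fold when it validates the chain; the sketch merely names the pattern.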
The concept of Sidechains has arisen from the general understanding that Bitcoin doesn't fit all use cases; in particular, it cannot stretch indefinitely to satisfy every different business scenario.
One- and two-way pegs were introduced to let custom chains communicate with Bitcoin and with each other.
For the enterprise environment, the ability to physically split information between different chains whilst permitting referencing between them is extremely attractive, not least for security and regulatory reasons (e.g. “Chinese Walls”, geographical restrictions, etc.).
Anonymity and full traceability are a trade-off. Anonymity can create complications with KYC (and other regulations), whilst being desirable, indeed necessary, in other contexts (e.g. Swiss banking). From this perspective, the anonymity mechanisms originally proposed in the Bitcoin community show promise for controlled use in a variety of environments.
Modern cryptography can now provide cryptosystems requiring the unavoidable collaboration of a specific number of separate parties (e.g. 3 of 12 company directors) to confirm a decision (or to access some information).
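A minimal sketch of such a k-of-n scheme using Shamir's Secret Sharing (a toy prime field, illustrative only, not production cryptography):

```python
# Toy sketch of k-of-n threshold access via Shamir's Secret Sharing:
# any 3 of 12 directors can jointly reconstruct the secret; fewer
# than 3 shares reveal nothing. Not production crypto.
import random

P = 2**127 - 1  # a Mersenne prime, large enough for a toy example

def make_shares(secret: int, k: int, n: int):
    """Random degree-(k-1) polynomial with f(0) = secret; shares are (x, f(x))."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation of the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = make_shares(123456789, k=3, n=12)
assert reconstruct(shares[:3]) == 123456789   # any 3 shares suffice
assert reconstruct(shares[4:7]) == 123456789  # a different 3 also work
```

The same threshold idea underpins multi-signature schemes, where the collaboration happens at signing time rather than at secret-reconstruction time.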
All the following operations in an Enterprise-level system are expected to be role-based (and no existing implementation of Blockchains currently provides this).
A user who has recently joined the organization should be able to get access to the system (and be recognized by it).
They should lose this access on leaving the organization.
Depending on their role(s) within the organization, they should have access to different parts of the system's functionality. The set of accessible functionality should follow changes in the set of roles assigned to the user and their authorizations.
The ability to encrypt and decrypt data based on the requirements of a role covers a variety of data-protection use cases, from the simple "user lost his key" to more complex cases like "information available only to company Directors".
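The role-based requirements above can be sketched as a tiny RBAC model (role and permission names are invented for illustration, not taken from any real product):

```python
# Minimal RBAC sketch: users gain/lose roles on joining/leaving, and
# each permission check is resolved against the user's *current* roles.
ROLE_PERMISSIONS = {
    "trader":   {"submit_tx", "view_book"},
    "auditor":  {"view_book", "read_audit_log"},
    "director": {"view_book", "decrypt_board_docs"},
}

class Directory:
    def __init__(self):
        self.roles = {}                       # user -> set of roles

    def join(self, user, *roles):
        self.roles[user] = set(roles)

    def leave(self, user):
        self.roles.pop(user, None)            # all access lost immediately

    def can(self, user, permission):
        return any(permission in ROLE_PERMISSIONS.get(r, set())
                   for r in self.roles.get(user, set()))

d = Directory()
d.join("alice", "trader")
assert d.can("alice", "submit_tx")
assert not d.can("alice", "decrypt_board_docs")
d.leave("alice")
assert not d.can("alice", "submit_tx")        # access revoked on leaving
```

The point is that permissions are evaluated against the live role set, so revocation is a single directory update rather than a key-distribution exercise.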
Blockchains naturally allow creation and validation of cryptographic proof of facts.
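For example, Merkle audit paths are the standard way a blockchain proves that a fact (a transaction) is included in a block without shipping the other leaves; a compact sketch:

```python
# Sketch: proving a transaction is included in a block via a Merkle
# audit path, without revealing or transferring the other leaves.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])           # duplicate last node if odd
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes from the leaf up to the root."""
    level = [h(x) for x in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        path.append((level[index ^ 1], index % 2 == 0))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def verify(leaf, path, root):
    node = h(leaf)
    for sibling, leaf_was_left in path:
        node = h(node + sibling) if leaf_was_left else h(sibling + node)
    return node == root

txs = [b"tx0", b"tx1", b"tx2", b"tx3"]
root = merkle_root(txs)
assert verify(b"tx2", merkle_proof(txs, 2), root)
assert not verify(b"bogus", merkle_proof(txs, 2), root)
```

A verifier holding only the block header (the root) can check such a proof in logarithmic time, which is exactly how SPV clients work in Bitcoin.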
A legitimate expectation of any database user (for decades) is the ability to atomically validate a transaction's input, or to reject the transaction if that input could put the system into a logically inconsistent state. Users need to be provided with ways of not only validating but also actively reacting to inputs and state changes.
Another legitimate expectation is the ability to move the code, or the logic, as close to the data as possible, instead of moving data to the code. This ability in itself can create a huge performance improvement, and when combined with the option of saving the result back to the database with a full audit trail, the improvement is dramatic.
Yes, we are talking about Triggers and Stored Procedures.
In Blockchain world these are addressed in two ways:
- by hardcoding processing logic (e.g. payment transaction processing in Bitcoin-like networks), or
- by adding a scripting option to the protocol (the rather functionally limited op_codes in Bitcoin, or full Turing-complete languages in platforms like Ethereum and Codius).
There are of course legitimate reasons for using either of these options, but the important point is that enterprise-level Blockchain systems that don't provide convenient, user-friendly trigger and stored-procedure functionality will be at a massive disadvantage compared to any (distributed) database or computation framework.
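As a sketch of what trigger-like behaviour could look like on a ledger (a hypothetical design, not any existing platform's API): validation callbacks registered per event type run atomically before an entry is accepted, much as database triggers do.

```python
# Hypothetical sketch of a trigger-like validation hook on a ledger:
# callbacks registered per event type run before an entry is accepted,
# and any exception rejects the whole append.
class ValidationError(Exception):
    pass

class Ledger:
    def __init__(self):
        self.entries = []
        self.triggers = {}                    # event type -> [callbacks]

    def on(self, event_type, callback):
        self.triggers.setdefault(event_type, []).append(callback)

    def append(self, event_type, payload):
        for check in self.triggers.get(event_type, []):
            check(self.entries, payload)      # raises -> append rejected
        self.entries.append((event_type, payload))

def non_negative_balance(entries, payload):
    """Illustrative rule: reject debits past a per-account limit."""
    spent = sum(p["amount"] for t, p in entries
                if t == "debit" and p["account"] == payload["account"])
    if spent + payload["amount"] > payload.get("limit", 100):
        raise ValidationError("would overdraw account")

ledger = Ledger()
ledger.on("debit", non_negative_balance)
ledger.append("debit", {"account": "acc1", "amount": 60})
try:
    ledger.append("debit", {"account": "acc1", "amount": 60})  # rejected
except ValidationError:
    pass
assert len(ledger.entries) == 1
```

The callback sees the full history plus the candidate entry, which is the ledger analogue of a BEFORE INSERT trigger over an append-only table.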
When talking about Blockchain in its original Bitcoin sense, many authors use the term "Distributed Database". This is a misuse of the term, because transaction data is replicated (copied) to every participating node rather than distributed between some of them. So the term "Replicated Database" is more appropriate.
"Replicated Database" means that Data Storage requirements grow at least linearly (and network traffic grows super-linearly) with the number of Validation Nodes, regardless of the actual feasibility of such heavy replication. Effectively this rules out proper data partitioning (sharding). (See the "CAP theorem" for more details.)
Another consideration is the fact that modern Cloud and other infrastructure providers supply customers with data storage solutions that are already powered by hardware data replication.
Let’s look into the scalability a bit more closely.
Bitcoin has a known limitation of around 3 transactions per second, which is a laughable amount these days; the more I talk to (ex-)banking people, the more I hear the same concern.
The same applies to block confirmation time. Whether it is on average one block per 10 minutes (Bitcoin), per 12 seconds (Ethereum) or per 1-5 seconds (some other platforms), it doesn't matter: it is still too slow. Most e-commerce solutions would rule this out on their own. Do I need 1 million per second, or 1 per microsecond?
Generally, "n transactions per second" is a slightly cheeky term, as it actually conflates two dimensions: throughput vs latency. To some extent this can be rephrased as horizontal vs vertical scaling.
A frank example: the same reasonably large amount of something can be moved from A to B using Formula 1 cars or freight trains. F1 is low latency/small throughput; a freight train is high latency/huge throughput.
So when we are addressing throughput (and the calculations we run are reasonably parallelizable), we can potentially split the computation into independent realms (partitions, shards) and execute them on separate chains. For whatever target number, you just add more chains and machines, set up cascading aggregations, and there you go.
If we are addressing latency, the situation is slightly more interesting, as every single hashing, network or persistence operation pumps this latency up, and the question becomes what we are willing to sacrifice: consistency, business risk, availability, immutability and so on.
Milliseconds no longer qualify as "true HFT"; yet even in a garbage-collected language you can still achieve on the order of tens of millions of relatively simple operations per second per thread.
Additionally, the initial stream of transactions can be partitioned into a number of parallel chains to simultaneously increase throughput.
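Such partitioning can be sketched as routing each transaction to one of several parallel chains by a stable hash of its account key (the shard count and field names are illustrative):

```python
# Sketch: partitioning one incoming transaction stream across several
# parallel chains (shards) by a stable hash of the account key, so each
# chain can be validated independently and throughput scales with shards.
import hashlib

N_SHARDS = 4  # illustrative; in practice chosen per deployment

def shard_of(account: str) -> int:
    digest = hashlib.sha256(account.encode()).digest()
    return int.from_bytes(digest[:8], "big") % N_SHARDS

def partition(stream):
    chains = [[] for _ in range(N_SHARDS)]
    for tx in stream:
        chains[shard_of(tx["account"])].append(tx)
    return chains

txs = [{"account": f"acc{i}", "amount": i} for i in range(1000)]
chains = partition(txs)
assert sum(len(c) for c in chains) == len(txs)          # nothing lost
assert all(shard_of(tx["account"]) == i                 # stable routing
           for i, c in enumerate(chains) for tx in c)
```

Because the routing is a pure function of the key, any node can independently agree on which chain owns which account, and cross-shard transactions become the (harder) special case rather than the norm.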
The vast majority of existing Blockchain implementations supply users with a convenient way of writing data into the system (putting it into a replicated transaction log, or distributed ledger). But with reading, and especially querying, data the situation is far less straightforward.
Users are presented with the current state of the system (in the case of Bitcoin, via balances). But finding out how some specific account looked at the moment when, say, the difference between two other accounts was minimal and a fourth one held at least 100 BTC requires writing very non-trivial data processing logic. It may also require traversing an enormous amount of data, as there is no guarantee as to which blocks contain transactions related to the accounts in question.
If we put payments aside and try to create a more general-purpose Blockchain-based solution, we quickly realize that standard SQL databases are simply not designed to work efficiently with data represented as a stream of facts in the perfect tense (transactions, events, etc.) rather than in the continuous tense (balances, final states).
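A sketch of why the event-stream ("perfect tense") representation matters: a point-in-time question like the one above is answered by replaying the stream, something a balances-only table cannot do at all. The query below is a simplified stand-in for that example.

```python
# Sketch: a point-in-time query ("what did account A look like when
# condition X first held?") answered by replaying the event stream.
# A current-state (balances) table has already thrown this answer away.
def replay_query(events, watched, condition):
    """Return the balance of `watched` at the first event after which
    `condition(balances)` holds, or None if it never does."""
    balances = {}
    for sender, receiver, amount in events:
        balances[sender] = balances.get(sender, 0) - amount
        balances[receiver] = balances.get(receiver, 0) + amount
        if condition(balances):
            return balances.get(watched, 0)
    return None

events = [("a", "b", 50), ("b", "c", 120), ("c", "d", 20)]
# "what was 'a' worth at the moment 'c' first held at least 100?"
assert replay_query(events, "a", lambda b: b.get("c", 0) >= 100) == -50
```

Real systems avoid full replays by maintaining indexed projections (materialized views) per query shape, but the underlying model is the same fold.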
1. Centralization vs decentralization
1.1. Disaster Recovery
The first question any responsible manager usually asks before approving a new IT system for production use is “who shall I call if things go terribly wrong?”
In the case of a centralized system the answer is very obvious: call 1st/2nd/3rd-level support, wake them up if necessary, and let them deal with the issue.
In the case of a fully decentralized system this option is not so readily available, as decentralization assumes that different parts of the system are controlled, from a support perspective, by different entities, with coordination between them unavoidably required for any system-integrity-level action.
From this perspective even Bitcoin is not a fully decentralized system. At the end of the day it is one network.
Same logic applies to other standard operations jobs.
Software releases should follow established procedures.
Users must have dedicated points of contact.
Decentralized validation assumes decentralized data storage.
For a large enough organization this requires separate Database provisioning and management for every node or cluster in the distribution, with all the associated burden.
Direct use of a separate Proof-of-Work implementation for an in-house system, even in a global multinational organization, looks like massive overhead with quite limited business benefit.
That is actually one of the reasons why so many Proof-of-Stake projects have arisen recently. Most of them are far less computationally intensive than PoW-based ones, but their sheer variety by itself tells us that none of them completely solves issues like "nothing at stake". And we can probably expect more "Proof of XXX" algorithms to be announced in the near future.
As an alternative, a combination of PoW and PoS can be used (e.g. Proof of Activity; for more information, please read the Bytecoin article by Ray Patterson: https://bytecoin.org/blog/proof-of-activity-proof-of-burn-proof-of-capacity/), or a combination of Bitcoin and Sidechains (please see the Avalanchain Bitcoin nailing section for further details).
Existing Blockchain implementations select network-level consensus mechanisms with common parameters for every application, user and scenario.
In reality this creates a rather interesting situation in which blockchain developers are effectively making decisions about how the End User's business should operate.
On these grounds, it is a reasonable expectation for an Enterprise-level Blockchain system, which is expected to survive for years, to have its Consensus mechanism pluggable or swappable depending on actual business needs and scenarios.
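A pluggable consensus mechanism can be sketched as a strategy interface that the chain logic depends on, so policies can be swapped per deployment (the interface and both policies here are invented for illustration):

```python
# Sketch of pluggable consensus (strategy pattern): the chain logic
# depends only on an abstract Consensus, so PoW, PoS or a simple quorum
# vote can be swapped per deployment without touching the chain code.
from abc import ABC, abstractmethod

class Consensus(ABC):
    @abstractmethod
    def accept(self, block, votes) -> bool: ...

class QuorumConsensus(Consensus):
    """Accept when a fixed fraction of known validators approve."""
    def __init__(self, validators, threshold=2/3):
        self.validators, self.threshold = set(validators), threshold

    def accept(self, block, votes):
        approvals = sum(1 for v in votes if v in self.validators)
        return approvals / len(self.validators) >= self.threshold

class SingleSignerConsensus(Consensus):
    """Trivial policy for a private, fully-trusted deployment."""
    def __init__(self, signer):
        self.signer = signer

    def accept(self, block, votes):
        return self.signer in votes

def commit(chain, block, votes, consensus: Consensus):
    if consensus.accept(block, votes):
        chain.append(block)
    return chain

quorum = QuorumConsensus({"n1", "n2", "n3"})
assert commit([], "blk", {"n1", "n2"}, quorum) == ["blk"]   # 2/3 reached
assert commit([], "blk", {"n1"}, quorum) == []              # not enough
```

Swapping the policy is then a configuration decision made by the business, not a protocol decision made for it by the platform's developers.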
Let's be honest: within a real organization it is quite difficult to find a rationale for making one department pay another some amount of virtual currency for validating its transactions.
Such applications are naturally limited to currency exchange/trading, and different types of order entry applications.
More commonly used patterns usually involve renting real or virtual machines, with time-based billing in real departmental budget currency.
In Bitcoin, Ethereum and pretty much all the others, every operation requires some amount of home-brewed virtual currency to be attached in order to proceed (and to incentivize the people owning the processing/validating machines, as well as to allow priority execution). The problem lies in deciding how this virtual currency is distributed among departments in the first place, and why the people owning the machines should care about receiving any amount of it. So effectively it is a way of creating a parallel internal economy alongside regular budgeting.
An alternative approach is to use normal budgeting currency and normal internal accountancy (in, say, USD) for resource charging.
Or, at the very least, make fees optional and/or pluggable.