Outputs and Outcomes
Our focus is to:
- Improve the performance and reliability of the Protocol
- Monitor the exchange rate and make adjustments as needed
- Maintain the Anchor Master, securing the Factom blockchain for the long term
Status and Achievements
June was a huge win for factomd development. A long-anticipated release went smoothly and brought significant scale and stability improvements to the network.
We are also happy to announce that June saw the successful conclusion of the Factom, Inc. Protocol Grant 012. One of the deliverables for the grant was a Refactor Plan report, which gives insight into the medium-term technical goals that Factom, Inc. has for core development.
June began with a test of the Bond release candidate on the Community Testnet. The testnet crew ran a load test, in which the network was able to process more entries per second than a single node was able to process through the API. The network leaders continued to build blocks throughout the 15 entries per second test period. Community member Paul Bernier has written a report about the load testing here.
June had two sponsor meetings, one of which was recorded and posted as a video. The sponsors in attendance in June were Dominic Luxford, Nolan Bauer, and Valentin Ganev and Nikola Nikolov (both of Factomatic).
June also marked the first month that Steven Masely worked at Factom, Inc. as a degree-holding computer science graduate. He had helped in the past as a co-op student and with some part-time work, and he is now helping out full time.
The main event for Factom protocol development in June was the successful rollout of the Bond release, v6.3.2 (https://github.com/FactomProject/factomd/pull/736/files). The upgrade was accepted by the community and installed on servers around the network. This is a big deal because the Bond code represents many months of advancements to the codebase. Until the v6.3.2 release, the bulk of factomd development was not benefiting the mainnet. A deployment of the updated codebase was attempted in April as v6.2.2, but it did not go well and was pulled back. After several bugs in the codebase were uncovered and fixed, Bond was installed by the authority set on the mainnet.
The Bond release also dramatically reduced the memory required to run factomd. Previously, a 4GB machine was not enough to boot and run factomd. Now 4GB of RAM is sufficient to run a follower under normal circumstances. (More RAM still helps when unusual circumstances arise.) The network is also now protected against some security vulnerabilities that were fixed in late 2018, so we can now talk about some of them.
We want to thank the cybersecurity firm PeckShield for responsibly disclosing a vulnerability they found. It allowed a specially crafted message describing negative block sizes to crash factomd nodes. The network is protected against this now that v6.3.2 has been deployed.
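To illustrate the general class of bug, here is a minimal Go sketch of range-checking a declared size before using it. This is not factomd's code or wire format: the 4-byte signed length prefix, `readBody`, `maxBodySize`, and `errBadSize` are all hypothetical names chosen for the example.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// maxBodySize is an illustrative upper bound, not a factomd constant.
const maxBodySize = 1 << 20

// errBadSize is a hypothetical sentinel error for this sketch.
var errBadSize = errors.New("declared body size out of range")

// readBody parses a hypothetical message laid out as a 4-byte big-endian
// signed length followed by that many bytes of body.
func readBody(msg []byte) ([]byte, error) {
	if len(msg) < 4 {
		return nil, errors.New("message too short for length prefix")
	}
	size := int32(binary.BigEndian.Uint32(msg[:4]))
	// Reject negative, oversized, or overstated lengths before
	// allocating or slicing, so a crafted size cannot cause a panic.
	if size < 0 || size > maxBodySize || int(size) > len(msg)-4 {
		return nil, errBadSize
	}
	body := make([]byte, size)
	copy(body, msg[4:4+int(size)])
	return body, nil
}
```

With this check in place, a message whose prefix decodes to -1 is rejected with an error instead of reaching the allocation.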
In Bond, the biggest bug that was found and fixed was the cause of the 6.2.2 pause. There was a situation where a leader could send out two acknowledgements for the same process list height. This would violate some of the consensus rules, and the network would stop to protect against damage. Resolving this bug removed the last barrier to getting the multi-month accumulation of performance and stability advancements live on the network.
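The invariant at stake, at most one acknowledged message per process list height, can be sketched in a few lines of Go. This is only an illustration under simplifying assumptions: `ackTracker` and `record` are hypothetical, heights are plain integers, and message hashes are strings. It does not reflect how factomd actually tracks acknowledgements.

```go
package main

import "fmt"

// ackTracker is a hypothetical helper that remembers which message hash
// was acknowledged at each process list height.
type ackTracker struct {
	seen map[uint32]string // height -> acknowledged message hash
}

func newAckTracker() *ackTracker {
	return &ackTracker{seen: make(map[uint32]string)}
}

// record stores an acknowledgement. It returns an error if a different
// message was already acknowledged at the same height.
func (a *ackTracker) record(height uint32, msgHash string) error {
	if prev, ok := a.seen[height]; ok {
		if prev == msgHash {
			return nil // harmless repeat of the same acknowledgement
		}
		return fmt.Errorf("conflicting ack at height %d: %s vs %s", height, prev, msgHash)
	}
	a.seen[height] = msgHash
	return nil
}
```

A leader-side check like this would refuse to emit the second acknowledgement, while a follower-side check would flag it as a rule violation.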
Another advancement in Bond was the rewrite of the Entry syncing code, which downloads entries in the 2nd pass. This rewrite was an important precursor to sharding, and it also made the 2nd pass of downloading the blockchain faster and more stable.
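The release notes describe the rework as using channels and goroutines. The following Go sketch shows one common shape for that pattern, a small worker pool fed by a channel. It is an assumption-laden illustration, not the Bond implementation: `syncEntries`, `fetchFunc`, and `errNotFound` are hypothetical, and the fetch function stands in for whatever retrieves an entry from a peer or from disk.

```go
package main

import (
	"errors"
	"sync"
)

// errNotFound is a hypothetical error a fetch function may return.
var errNotFound = errors.New("entry not found")

// fetchFunc stands in for whatever retrieves one entry by hash.
type fetchFunc func(hash string) ([]byte, error)

// syncEntries fetches every hash using a pool of worker goroutines.
// It returns the retrieved entries keyed by hash, plus the hashes
// that could not be fetched.
func syncEntries(hashes []string, workers int, fetch fetchFunc) (map[string][]byte, []string) {
	if workers < 1 {
		workers = 1
	}
	type result struct {
		hash string
		data []byte
		err  error
	}
	jobs := make(chan string)
	results := make(chan result)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for h := range jobs {
				data, err := fetch(h)
				results <- result{h, data, err}
			}
		}()
	}

	// Feed the work queue, then close results once all workers finish.
	go func() {
		for _, h := range hashes {
			jobs <- h
		}
		close(jobs)
		wg.Wait()
		close(results)
	}()

	got := make(map[string][]byte)
	var missing []string
	for r := range results {
		if r.err != nil {
			missing = append(missing, r.hash)
			continue
		}
		got[r.hash] = r.data
	}
	return got, missing
}
```

Collecting results in a single goroutine keeps the map free of data races without a mutex, and the missing list gives the caller something to retry.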
A vulnerability was also discovered on the testnet. In a normally operating system, a node that receives an invalid transaction ignores it. The bug found on the testnet caused invalid transactions to be repeated and spread around the network, which created an opportunity for denial of service. The bug has been fixed in Bond.
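The principle behind the fix, validate first and only then rebroadcast, can be sketched in Go as follows. Everything here is hypothetical and heavily simplified: the `transaction` type, its single validity rule, and the `relay` helper are illustrative assumptions rather than factomd types or rules.

```go
package main

// transaction is a hypothetical, heavily simplified transaction.
type transaction struct {
	ID      string
	Inputs  uint64
	Outputs uint64
}

// valid applies one illustrative rule: a transaction needs an ID and
// may not pay out more than it takes in.
func (tx transaction) valid() bool {
	return tx.ID != "" && tx.Outputs <= tx.Inputs
}

// relay forwards transactions to peers through send.
type relay struct {
	send func(transaction)
}

// handle rebroadcasts tx only if it validates, and reports whether
// the transaction was forwarded.
func (r *relay) handle(tx transaction) bool {
	if !tx.valid() {
		return false // drop silently; never amplify invalid traffic
	}
	r.send(tx)
	return true
}
```

Because the validity check sits in front of the only call to `send`, an invalid transaction stops at the first honest node that sees it.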
Another significant issue that the Bond release fixed was End of Minute handling. The internal messages that a Federated server uses to decide when to finish its section of the blockchain were getting lost internally. Creating a dedicated pathway for these messages increased stability.
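One way to picture a dedicated pathway is a separate queue that general traffic cannot fill. The Go sketch below is an assumption about the general technique, not a description of factomd internals: `router`, `message`, and the `"EOM"` kind string are hypothetical.

```go
package main

// message is a hypothetical stand-in for an internal node message.
type message struct {
	Kind string
	Body string
}

// router keeps End of Minute messages on their own buffered channel so
// they cannot be crowded out by general traffic.
type router struct {
	eom     chan message
	general chan message
}

func newRouter(generalCap, eomCap int) *router {
	return &router{
		eom:     make(chan message, eomCap),
		general: make(chan message, generalCap),
	}
}

// enqueue places m on the channel for its kind without blocking.
// It returns false if that channel is full.
func (r *router) enqueue(m message) bool {
	ch := r.general
	if m.Kind == "EOM" {
		ch = r.eom
	}
	select {
	case ch <- m:
		return true
	default:
		return false
	}
}

// next returns the next pending message, always preferring EOMs.
// The second result is false when nothing is pending.
func (r *router) next() (message, bool) {
	select {
	case m := <-r.eom:
		return m, true
	default:
	}
	select {
	case m := <-r.eom:
		return m, true
	case m := <-r.general:
		return m, true
	default:
		return message{}, false
	}
}
```

In this sketch a full general queue rejects further general messages, but an EOM arriving at the same moment is still accepted and is handed out first.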
The Core Developer standups also continued in June. On average, two to three people from outside Factom, Inc. attended the weekly standup.
As a highlight of the community contributions, Sander from BIF started delivering some improvements to the factomd code in June. There were several other examples of community contributions in June, and this process is becoming more common.
The Factom, Inc. Protocol Grant 012 had four deliverables, all of which were achieved. The deliverables were:
| # | Deliverable | Status | Notes |
|---|---|---|---|
| 1 | Refactor (documentation) | Success | This is available here. |
| 2 | Maintenance | Success | There were tons of bugfixes, and the majority of the released items were maintenance items. The bulk of them were deployed in Bond. |
| 3 | Testing | Success | All the changes to factomd go through various batteries of testing as they progress towards the release process. This is long before it gets to the Testnet. Automated tests also double-check each check-in. (example and result) |
| 4 | Developer Support (limited) | Success | This grant period saw a large increase in the amount of non-Factom, Inc development that was happening. The weekly core developer standups were also started during this grant period. |
All four grant deliverables were successfully delivered for the Development grant ending in June. This report spans two grant periods, so both completed work and work in progress are listed here.
New capabilities and maintenance issues that were released
- Resolve dual ACK bug where under load a leader could produce two ACKs for the same height.
- Don’t pause the network when high traffic causes extraneous messages that break consensus and stop forward progress
- Rework Entry Syncing with channels and go routines
- Massively speed up the 2nd pass of downloading the blockchain and make it more reliable.
- Avoid short minutes when holding queue is backed up
- The work done as part of this ticket will help tune the efficiency for Leaders when the network is under load.
- Invalid Transactions should not be sent to the network
- Limit a node from rebroadcasting invalid transactions to the network.
- Keep Leaders from getting out of sync on minutes
- Allow the network to continue processing entries when Federated servers are booted at different times.
- Community Contribution – 2nd pass flagging issue on boot
- Remove confusion in the control panel about whether the 2nd pass has actually completed when starting factomd.
- Logging saves only partial hashes but in some cases the whole hash is required. Add a separate log of all unique full hashes.
- To help debug curious messages on the network, print out the full entry hash to aid in tracking down the problem
- EOMs from the future can be dropped in review holding
- Don’t prematurely delete messages that are likely to be useful
- remove code that tosses incoming messages based on holding
- Retain messages instead of deleting them shortly before needing them.
- FetchPaidFor can have a nil pointer exception if it encounters an unallocated process list
- Don’t panic during edge cases when starting factomd
- Fix an issue with the writing go routine for writing entries
- Speed up saving the blockchain on slower machines
- Make message sort and process/update state be in separate threads.
- Reduces deadlock potential (pauses) and smooths out message processing.
- Reveals must wait on their Commits
- Handle messages in the correct order to prevent seizing up of a node
- Send Out all messages in the holding queue at a proper rate
- Allow messages that are known about to propagate through the network under edge cases
- Off by one error in DBsigs removed from holding queue
- Allow factomd to start up more easily under load
- Validate reveals before rebroadcast
- Ensures that reveals are validated before they are sent to peers, and cleans up inefficient code that did not help performance
- Remove simulator load creation from its own thread
- Allows for a higher load in simulation to stress the system more.
- Accounting Blockchain Conference, June 5th, New York, NY
- Tech Titans’ Blockchain / IoT Forum, June 11th, Dallas, TX
- Central Texas Consultants Network, June 26th, Austin, TX
New capabilities and maintenance issues that are pending
- Inefficient DBState loading while syncing from the network
- Better handle the first pass of downloading the blockchain
- Make holding management be dependency driven instead of scan driven.
- Change holding queue to understand orders of operation to increase performance as well as to prepare for sharding
- Optimistic Entry Writes
- Distributes the database writes across block time to avoid pauses in processing on block boundaries.
- Downloads dbstates in batches when syncing from the network
- Increases efficiency when syncing from the network
- Refactor sim testing
- This ticket rolls up past work into the latest revision. This enables new types of tests to be written, including adding a node during simulation and filtering out specific messages during testing. Additionally, this will run simulation tests that previously did not execute on CircleCI.
- Factomd Authority JSON marshalling should be usable
- correctly export server identity coinbase address and efficiency fields in JSON unmarshalling.
- repair balance checking tool
- Corrects an issue with the balance checking tool so that it can be used to check whether the database is corrupted
- Refine some of the unit test code
- Add unit tests for local commits and local wallet simulations
- Add more simulation testing scenarios
- Adding more extensive testing for other brain-swap scenarios will help keep more types of backward-incompatible changes from making it into the codebase.
- Community Contribution – Brainswap an Audit Server
- Don’t panic when brain swapping in an Audit server.
- Community Contribution – close local TCP handler when connection drops
- Better handle error conditions with transient p2p connections
- Make Tests for FilterAPI
- Make sure that regression tests can be run without race conditions
- Community Contribution – add if booted from disk to diagnostics API
- allow outside programs to know when the 1st pass has been fully processed from disk.
- Split receipts API into receipts and anchors
- With ethereum anchoring enabled, we need a way for developers to ask factomd which anchors covered a given transaction.
- Add ability to set a path for the log file output by prepending it to the debugregex.
- Increase testing ability by specifying where logfiles can be collected
- fix a pokemon instance involving MessageBase
- Catch a new style of Pokemon bug found on the testnet with MessageBase
- Community Contribution – Use different muxes for various web services
- Don’t overlap the various web services factomd provides
- It looks like a node is asking for DBStates it clearly has.
- Improve downloading of the blockchain
- Ack holding
- Fixes bug that could cause useless elections immediately after boot.
- Move Filter API to Debug
- Limit scope of testing tools on the API
- Missing Message Requests are tossed if the ask has an all 0 ID.
- Don’t ignore peers who added the identity of all zeros to their config file.
- Null pointer exception is possible checking payments for commits
- Don’t panic when timing issues create a null process list
- Community Contribution – Add configuration ability to the peer connection limit
- Allow users to increase or decrease the number of peers they connect to.
- Improve scripts for dealing with logs from QA
- Allow better debugging when QA discovers problems
- DBStateCatchupList is not thread safe
- Don’t suffer a race condition when downloading the blockchain
- Parchment + Bond Merge Fix Unit Tests + Minor Bugfixes
- allow further development using unit tests that had failed in the merge of Bond and Parchment
- Reposting eoms that never get executed builds up [EOM Poison]
- This is mainly exercised in unit tests with shorter block times than mainnet, and fixing it would resolve some automated testing failures.
- Fix circle ci simtests for factomd
- More consistent testing results when using CircleCI
- panic on FD-1053_bond_parchment_merge when follower is connected to mainnet
- Allow factomd to load from the database without a race condition
- Community Contribution – Runtimelog doesn’t respect its own enabled setting
- Performance optimization when not debugging
- DBStateCatchup getNextConsecutiveMissing has off by 1 error
- Prevent asking for dbstates beyond the highest known block
- Dependent holding items whose dependencies are met by the content of a DBState are not released
- don’t leak memory with Dependent Holding upgrades
- Make the version and git version settable from GoLand
- Allow the GoLand debugger to show the git commit hash and golang version in factomd and the control panel
- Community Contribution – CrossBoot replay garbage collection never ends
- Let the garbage collector use fewer resources with the cross boot replay filter
- Inefficiency with calling time.now() multiple times
- use less CPU when handling p2p peers
- Community Contribution – finish Election Sync fix
- more effectively clean up sync message handling
- Community Contribution – Legibility Improvements Part 1
- Factomd code should be more legible when possible
- Ongoing Performance Improvements
- Ongoing Maintenance corrections
- Continue supporting the protocol and issues that arise
- Ongoing Maintenance corrections
- Ongoing Maintenance corrections