Grant Update August 2019

Technical Updates | September 24, 2019

Status and Achievements

August was a month with mixed success on the factomd core development front.  It saw an unsuccessful deployment of the Parchment release, and a difficult-to-debug network pause, but much progress was made towards the Xuan release.

There were two sponsor meetings in August 2019, one of which was recorded.  A new sponsor for the Protocol Development Grant joined in August: David Kuiper of Bedrock Solutions took over from Domonic Luxford (88mph), who had recently resigned.  The sponsors in attendance in August were Nolan Bauer, David Kuiper, Valentin Ganev, and Nikola Nikolov (the latter two of Factomatic).

The August 28th sponsor meeting is available here:  https://youtu.be/kl6d7aHpJXQ

The monthly Jira and Github reports are also available:

Jira Report: https://drive.google.com/open?id=17Bw_NRcGtKa-n3Gnmhg2Deo582d9OKRN

Github Report: https://drive.google.com/open?id=1IfVxjWSNYouwuybe-42Vvh63An4rwbrB

August began with limited testing of the Parchment release (v6.3.3) by Authority Node servers.  Although it had passed all testing on the Community Testnet, it exhibited some unstable behavior there and was ultimately not pushed out for general release.

August also concluded grant round 2019-3, which needed a new factomd release to activate.  The version predominantly running on the network servers was v6.3.2 (Bond), so the grants were built on that version and released as v6.4.0 (codenamed Origami).  This was the first release for which the Factomize forum generated the code representing the grant winners.  Testing procedures were still exercised, but the new process should simplify grant releases going forward.

Unfortunately, during the rollout of the 6.4.0 code onto the Federated Servers, there was an operational mistake by one of the node maintainers.  It resulted in the network coming to a pause as part of the failsafe design. Here is a detailed five-page after-action report on the network pause: https://drive.google.com/open?id=1gHtb2AvdrPU8j8dmMFAaiS8eTxmyqMKU

As part of recovering from the pause, one of the bugfixes from Parchment was applied to the 6.4.0 release.  The resulting release, 6.4.1 (codenamed Cotton), was successfully deployed across the network and got transactions processing again.

Despite the setbacks, there was significant forward progress with factom core development in August.  

One problem seen in Parchment was that nodes would sometimes not follow the blockchain minute by minute as it was being built.  Following closely is critical for the Federated servers, as they cannot download the blocks later and instead need to collaboratively build them over the 10-minute block period. The root cause was found to be a race condition in timestamp generation.  This was fixed in FD-1155 and is pending release in Xuan. https://github.com/FactomProject/factomd/pull/835/files

Another important bug that was solved involved two elections happening during the same minute of the same block.  This is a big stability fix that will help the network recover far better from network disruptions. This was fixed in FD-1113 and is pending release in Xuan. https://github.com/FactomProject/factomd/pull/800/files

In addition to the good news about the bugfixes, August continued to see contributions from community developers.  For example, Michael Lam submitted a couple of pull requests:

https://github.com/FactomProject/factomd/pull/830/files

https://github.com/FactomProject/factomd/pull/804/files

In addition to attending the sponsor meetings, David Kuiper has been attending the weekly core developer standups.  This will help him assess how well Factom, Inc has performed over the grant period.

Another big contribution from community developers was an update to the API handler, which is now based on a maintained library. https://github.com/FactomProject/factomd/pull/801/files  The change affected how port forwarding is handled by factomd.

For example, if a factomd node is behind a port forward, the calling application might need to include additional information in its requests to factomd.

A call that used to work like this:

curl --data-binary '{"jsonrpc": "2.0", "id": "0", "method": "network-info", "params": {} }' -H 'content-type:text/plain;' http://127.0.0.1:8000/debug

would now need to look like this:

curl --header 'Host: 127.0.0.1:8088' --data-binary '{"jsonrpc": "2.0", "id": "0", "method": "network-info", "params": {} }' -H 'content-type:text/plain;' http://127.0.0.1:8000/debug

Notice the new --header 'Host: …' parameter.
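
For callers that use an HTTP library rather than curl, the same request can be built programmatically. The following is a minimal Python sketch, assuming the same addresses as the curl example above (a connection to 127.0.0.1:8000 and a Host value of 127.0.0.1:8088). The helper name build_network_info_request is hypothetical and not part of factomd. The sketch only constructs the request; sending it requires a reachable node.

```python
import json
import urllib.request


def build_network_info_request(url, host):
    """Build a JSON-RPC 'network-info' request with an explicit Host header.

    Illustrative helper, not part of factomd. `url` is the address the
    caller connects to; `host` is the value sent in the Host header.
    """
    payload = {"jsonrpc": "2.0", "id": "0", "method": "network-info", "params": {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Host": host, "content-type": "text/plain;"},
        method="POST",
    )


# Values taken from the curl example above.
request = build_network_info_request("http://127.0.0.1:8000/debug", "127.0.0.1:8088")

# To actually send it (requires a reachable factomd node):
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(json.loads(response.read()))
```

Because the Host header is set explicitly, urllib sends that value instead of deriving one from the URL, which mirrors what the added curl parameter does.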

Factom, Inc also continues to support the Livefeed API which Sphereon is implementing.  There was a design review in late August with 8 Factom, Inc members to help with the evaluation.

All this is shaping up to make Xuan the most advanced and reliable factomd release yet.  Xuan has many bugfixes and stability improvements that have been hard to find and implement, but will be worth the effort.

Completed

New capabilities and maintenance issues that were released

  • FD-1160
    • Activate grants for grant round 2019-3
    • Encode the results of grant round 2019-3 (held August 2019) into the blockchain
  • FD-1169
    • Allow DBsigs that are created before boot time to be valid
    • When leader nodes are booted near the same time, allow the ones booted later to recognize the ones booted earlier
  • FD-1124
    • First dbstate on boot is processed twice
    • Stop double processing blocks.

 

Awareness

Blockchain in Healthcare – Showcase and Panel

In Progress

  • FD-1203
    • Maintain proper coinbase behavior
    • Xuan needs to maintain same behavior as previous release for coinbase transactions
  • FD-1201
    • Xuan dbstates sync off-by-one bug
    • Stop downloading blocks that a node already has locally
  • FD-1199
    • Expire stale messages from Holding
    • factomd could accumulate messages in holding and slow down over time, eventually crash by running out of memory
  • FD-1192
    • Avoid race condition with DBsigs and setting up for Factoid transactions
    • Avoid panic when booting node
  • FD-1190
    • Drop messages if outbound messages queue is full outside of wait period
    • Protect against a possible deadlock scenario under certain situations – where (for example) we are booting while the network is under load.
  • FD-1187
    • Remove Remnants of ELECTION_NO_SORT activation.
    • Remove unused code to make it more legible
  • FD-1186
    • Update EntrySyncing to get rid of unnecessary complexity
    • Slightly simplify the 2nd pass downloading code to increase reliability
  • FD-1183
    • Add more logging around process list debugging + scripts for analysis
    • Allows core developers more insight over the system internals
  • FD-1182
    • With 2 elections in a minute, correctly compute the appropriate VM
    • Fixes a serious bug where, after one election, audit servers are incapable of replacing a second federated server
  • FD-1181
    • Missing Message Processing is losing requests
    • Avoid deadlocks when following along with the blockchain when local messages are missing
  • FD-1180
    • Never Hold non-ACK’d messages
    • Avoid nodes getting clogged up with missing message responses if they can’t be used
  • FD-1177
    • investigate inconsistent responses from debug API
    • Debug API should return data from behind a port forward
  • FD-1176
    • Pokemon bug detected in Validate()
    • Protect against a newly detected form of the Pokemon bug to prevent random panics
  • FD-1174
    • Avoid a race condition on boot when RestoreFactomdState is set
    • Avoid corrupting the local state which would cause balances to be incorrectly calculated
  • FD-1172
    • Adjust message propagation filter to be set/reset by heartbeats
    • This allows followers to resume propagating network messages when an hour-long pause occurs. This will help recover from a future pause.
  • FD-1171
    • Update savestate to 13 so that fixes for FD-1091 (blocks divisible by 1000) are captured
    • Protects against the risk of nodes being stopped on a block divisible by 1000 and becoming corrupted.
  • FD-1162
    • Repair Dependent Holding to use highest block instead of leader height
    • Allow the blockchain to be downloaded more reliably in a stochastic or hostile network environment
  • FD-1155
    • Followers take many blocks to start following minutes after boot
    • Improve followers' and audit servers' ability to stay in real time with the blockchain. This helps audit servers take over when needed and lets followers know the latest transactions on the blockchain.
  • FD-1154
    • Increase Entry Sync retry rate
    • Reduce the amount of time it takes a node to catch up with the second pass blockchain download
  • FD-1146
    • Community Contribution – Factable solutions code changes
    • Comment and simplify code to make it more legible to future developers
  • FD-1138
    • New SimTest to test elections in every consecutive minute.
    • Create a test which can be run repeatably that runs many elections
  • FD-1137
    • test and allow golang 1.13
    • let developers build using the latest version of golang
  • FD-1136
    • EOM for Minute 10
    • Increase stability under load when a timing bug can cause factomd to get confused and degrade performance.
  • FD-1129
    • Correctly handle Rejected Messages
    • Fixes a potential bug when recovering from an election caused by a possible attack vector or by bugs.
  • FD-1126
    • Add a way to alert developers when Nightly CI build fails
    • Adding a visible alert to the automation pipeline will help maintain high standards of quality during the development life-cycle
  • FD-1114
    • Community Contribution – Factable solutions code cleanup
    • Code cleaning and commenting
  • FD-1113
    • Election Testing Failed
    • Sequential elections within the same block should elect the correct unresponsive VM
  • FD-1105
    • Make FollowerExecuteACK also look for messages in Dependent Holding
    • Fix a bug that hindered the performance improvements of Dependent Holding
  • FD-1104
    • Put missing message responses into their own thread
    • Increase reliability and prepare for refactoring by fully isolating missing messages and their responses from state access. Rely only on channels for communication
  • FD-1089
    • Extend SetupSim
    • Helpful for creating databases for DevTestNet; also removes the “special/annoying”-ness of FNode0 in tests.
  • FD-1041
    • Review holding map in order of arrival
    • Increase network capacity by decreasing randomness when lots of traffic is arriving over the p2p network
  • FD-1038
    • Community Contribution – Fix loading of custom cert paths
    • Actually use the TLS cert that is specified in the config file.
  • FD-932
    • All tests should be run against Factomd Releases
    • For each release we need a way to automatically test using all Simulation Tests. Currently this is a somewhat manual effort – automating this makes it easier to track history and expose visibility into the release process.
  • FD-924
    • Community Contribution – Docker dev environment misconfigured
    • Allow simulated development nodes to communicate in an isolated docker based environment.
  • FD-898
    • Update circle build environment to golang 1.12
    • Build with the latest circleCI/golang environments to maintain forward compatibility
  • FD-827
    • fast track chain commit/reveals
    • Prioritize Chain creation over Entry creation to help prevent backed up queues under load
  • FD-745
    • Test double application of factoid transactions when booted in minute 9
    • Test for regressions of a prior dangerous fault where factoid transactions are double applied, calculating the wrong balance
  • FD-450
    • Community Contribution – Removing Hoisie Web
    • Remove dependency on an old un-maintained library and use a more modern one instead for the API network handlers
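
To illustrate the idea behind FD-1199 in the list above (expiring stale messages from Holding so that memory use stays bounded), here is a minimal Python sketch. It is not factomd's implementation, which is written in Go; the class name HoldingMap, the max_age_seconds parameter, and the 600-second value are assumptions made purely for illustration.

```python
import time


class HoldingMap:
    """Toy holding area that drops messages older than max_age_seconds.

    Hypothetical illustration only; names and expiry policy are assumptions.
    """

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self.clock = clock  # injectable so the behavior can be exercised without waiting
        self._messages = {}  # message hash -> (arrival time, message)

    def add(self, msg_hash, message):
        # Keep the first arrival time if the same message shows up again.
        self._messages.setdefault(msg_hash, (self.clock(), message))

    def expire_stale(self):
        """Remove entries older than max_age_seconds; return how many were dropped."""
        cutoff = self.clock() - self.max_age_seconds
        stale = [h for h, (arrived, _) in self._messages.items() if arrived < cutoff]
        for h in stale:
            del self._messages[h]
        return len(stale)

    def __len__(self):
        return len(self._messages)


# Usage with a fake clock so the example does not need to sleep.
now = [0.0]
holding = HoldingMap(max_age_seconds=600, clock=lambda: now[0])
holding.add("a", "old message")
now[0] = 500.0
holding.add("b", "newer message")
now[0] = 700.0
dropped = holding.expire_stale()  # "a" is 700s old and is dropped; "b" is 200s old and stays
```

Calling expire_stale periodically keeps the map from growing without bound when messages arrive that can never be processed, which is the failure mode the ticket describes.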

Future Plans

Protocol Grant

  • Ongoing Performance Improvements
  • Ongoing Maintenance corrections
  • Continue supporting the protocol and issues that arise

Oracle Grant

  • Ongoing Maintenance corrections

Anchor Grant

  • Ongoing Maintenance corrections

Awareness

POSTED: September 24, 2019 BY Crystal Wiese IN Technical Updates
ABOUT THE AUTHOR

Crystal Wiese is the Director of Marketing at Factom. Working within the startup tech scene for the past 10 years she has become passionate about taking great ideas and building a strong comprehensive narrative. Graduating from the Art Institute of Portland with a degree in Design Management, Crystal has always seen the value in creating strong ties between the technical, creative and business.