First steps with Ethereum Private Networks and Smart Contracts on Ubuntu 16.04

Ethereum is still in its “move fast and break things” phase. The docs for contracts[1] are very out of date, and even the docs for mining contain some out-of-date content[2].

I wanted a very simple guide to setting up a small private network for testing smart contracts. I couldn’t find a simple one that worked. After much trial and error and digging around on Stack Exchange, below are the steps I eventually settled on to get things working with a minimum of copy/paste. I hope it proves useful for other noobs out there, and that more experienced people will help clear up anything I have misunderstood.

I’ll do 3 things:

  1. Set up my first node and do some mining on it
  2. Add a very simple contract
  3. Add a second node to the network

First make sure you have installed ethereum (geth) and solc, for example with:

sudo apt-get install software-properties-common
sudo add-apt-repository -y ppa:ethereum/ethereum
sudo apt-get update
sudo apt-get install ethereum solc

Set up first node and do some mining on it

Create a genesis file – the below is about as simple as I could find. Save it as genesis.json in your working directory.

{
  "config": {
    "chainId": 1907,
    "homesteadBlock": 0,
    "eip155Block": 0,
    "eip158Block": 0
  },
  "difficulty": "40",
  "gasLimit": "2100000",
  "alloc": {}
}
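If you would rather not mine your initial ether at all (see note [3]), you can pre-fund an account by replacing the empty "alloc" above with something like the below. The address here is just a placeholder – you would create your account first (next step) and put its address here before running init – and the balance is specified in wei (this one is 100 ether).

  "alloc": {
    "0000000000000000000000000000000000000001": { "balance": "100000000000000000000" }
  }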

Make a new data directory for the first node and set up two accounts to be used by that node. (Obviously your addresses will differ from the examples you see below.)

ethuser@host01:~$ mkdir node1
ethuser@host01:~$ geth --datadir node1 account new
WARN [07-19|14:16:22] No etherbase set and no accounts found as default 
Your new account is locked with a password. Please give a password. Do not forget this password.
Passphrase: 
Repeat passphrase: 
Address: {f74afb1facd5eb2dd69feb589213c12be9b38177}
ethuser@host01:~$ geth --datadir node1 account new
Your new account is locked with a password. Please give a password. Do not forget this password.
Passphrase: 
Repeat passphrase: 
Address: {f0a3cf66cc2806a1e9626e11e5324360ee97f968}

Choose a networkid for your private network and initialise the first node:

ethuser@host01:~$ geth --datadir node1 --networkid 98765 init genesis.json
INFO [07-19|14:21:44] Allocated cache and file handles         database=/home/ethuser/node1/geth/chaindata cache=16 handles=16
....
INFO [07-19|14:21:44] Successfully wrote genesis state         database=lightchaindata                          hash=dd3f8d…707d0d

Now launch a geth console

ethuser@host01:~$ geth --datadir node1 --networkid 98765 console
INFO [07-19|14:22:42] Starting peer-to-peer node               instance=Geth/v1.6.7-stable-ab5646c5/linux-amd64/go1.8.1
...
 datadir: /home/ethuser/node1
 modules: admin:1.0 debug:1.0 eth:1.0 miner:1.0 net:1.0 personal:1.0 rpc:1.0 txpool:1.0 web3:1.0

> 

The first account you created is set as eth.coinbase. This will earn ether through mining. It does not have any ether yet[3], so we need to mine some blocks:

> eth.coinbase
"0xf74afb1facd5eb2dd69feb589213c12be9b38177"
> eth.getBalance(eth.coinbase)
0
> miner.start(1)

The first time you run this it will create the DAG, which will take some time. Once the DAG is complete, leave the miner running until it has mined a few blocks, then stop it with miner.stop().

.....
INFO [07-19|14:40:03] 🔨 mined potential block                  number=13 hash=188f37…47ef07
INFO [07-19|14:40:03] Commit new mining work                   number=14 txs=0 uncles=0 elapsed=196.079µs
> miner.stop()
> eth.getBalance(eth.coinbase)
65000000000000000000
> eth.getBalance(eth.accounts[0])
65000000000000000000
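
By the way, those balances are reported in wei. If you would rather see ether, web3 can convert for you:

> web3.fromWei(eth.getBalance(eth.coinbase), "ether")
65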

The first account in the account list is the one that has been earning ether for its mining, so all we prove above is that eth.coinbase == eth.accounts[0]. Now that we’ve got some ether in the first account, let’s send some of it to the second account we created. The source account has to be unlocked before it can send a transaction.

> eth.getBalance(eth.accounts[1])
0
> personal.unlockAccount(eth.accounts[0])
Unlock account 0xf74afb1facd5eb2dd69feb589213c12be9b38177
Passphrase: 
true
> eth.sendTransaction({from: eth.accounts[0], to: eth.accounts[1], value: web3.toWei(3,"ether")})
INFO [07-19|14:49:12] Submitted transaction                    fullhash=0xa69d3fdf5672d2a33b18af0a16e0b56da3cbff5197898ad8c37ced9d5506d8a8 recipient=0xf0a3cf66cc2806a1e9626e11e5324360ee97f968
"0xa69d3fdf5672d2a33b18af0a16e0b56da3cbff5197898ad8c37ced9d5506d8a8"

For this transaction to register it has to be mined into a block, so let’s mine one more block:

> miner.start(1)
INFO [07-19|14:50:14] Updated mining threads                   threads=1
INFO [07-19|14:50:14] Transaction pool price threshold updated price=18000000000
null
> INFO [07-19|14:50:14] Starting mining operation 
INFO [07-19|14:50:14] Commit new mining work                   number=14 txs=1 uncles=0 elapsed=507.975µs
INFO [07-19|14:51:39] Successfully sealed new block            number=14 hash=f77345…f484c9
INFO [07-19|14:51:39] 🔗 block reached canonical chain          number=9  hash=2e7186…5fbd96
INFO [07-19|14:51:39] 🔨 mined potential block                  number=14 hash=f77345…f484c9

> miner.stop()
true
> eth.getBalance(eth.accounts[1])
3000000000000000000
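
As a cross-check, you can also look up the transaction receipt using the hash returned by eth.sendTransaction above – its blockNumber field is only populated once the transaction has been mined:

> eth.getTransactionReceipt("0xa69d3fdf5672d2a33b18af0a16e0b56da3cbff5197898ad8c37ced9d5506d8a8").blockNumber
14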

One small point: the docs talk about miner.hashrate. This no longer exists; you have to use eth.hashrate if you want to see the mining speed.
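
For example, while the miner is running (the figure you see will depend entirely on your hardware):

> eth.hashrate
30102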

Add a very simple contract

The example contract is based on an example in the Solidity docs. There is no straightforward way to compile a contract from within geth. Browser-solidity is a good online resource, but I want to stick to the local server as much as possible for this post. Save the following contract into a text file called simple.sol

pragma solidity ^0.4.13;

contract Simple {
  function arithmetics(uint _a, uint _b) returns (uint o_sum, uint o_product) {
    o_sum = _a + _b;
    o_product = _a * _b;
  }

  function multiply(uint _a, uint _b) returns (uint) {
    return _a * _b;
  }
}

And compile it as below:

ethuser@host01:~/contract$ solc -o . --bin --abi simple.sol
ethuser@host01:~/contract$ ls
Simple.abi  Simple.bin  simple.sol

The .abi file holds the contract interface and the .bin file holds the compiled code. There is apparently no neat way to load these files into geth, so we will need to edit them into scripts that can be loaded. Edit the files so they look like the below:

ethuser@host01:~/contract$ cat Simple.abi
var simpleContract = eth.contract([{"constant":false,"inputs":[{"name":"_a","type":"uint256"},{"name":"_b","type":"uint256"}],"name":"multiply","outputs":[{"name":"","type":"uint256"}],"payable":false,"type":"function"},{"constant":false,"inputs":[{"name":"_a","type":"uint256"},{"name":"_b","type":"uint256"}],"name":"arithmetics","outputs":[{"name":"o_sum","type":"uint256"},{"name":"o_product","type":"uint256"}],"payable":false,"type":"function"}])

and

ethuser@host01:~/contract$ cat Simple.bin
personal.unlockAccount(eth.accounts[0])

var simple = simpleContract.new(
{ from: eth.accounts[0],
data: "0x6060604052341561000f57600080fd5b5b6101178061001f6000396000f30060606040526000357c0100000000000000000000000000000000000000000000000000000000900463ffffffff168063165c4a161460475780638c12d8f0146084575b600080fd5b3415605157600080fd5b606e600480803590602001909190803590602001909190505060c8565b6040518082815260200191505060405180910390f35b3415608e57600080fd5b60ab600480803590602001909190803590602001909190505060d6565b604051808381526020018281526020019250505060405180910390f35b600081830290505b92915050565b600080828401915082840290505b92509290505600a165627a7a72305820389009d0e8aec0e9007e8551ca12061194d624aaaf623e9e7e981da7e69b2e090029",
gas: 500000
}
)

Two things in particular to notice:

  1. In the .bin file you need to ensure that the from account is unlocked
  2. The code needs to be enclosed in quotes and begin with 0x
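
A side note on the gas value: rather than guessing, you can ask the node for an estimate before deploying, by passing the same bytecode string (truncated here) to eth.estimateGas in the console:

> eth.estimateGas({from: eth.accounts[0], data: "0x6060604052341561000f..."})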

Launch geth as before, load the contract scripts, mine the contract creation into a block and then interact with the contract. As you can see below, we can’t do anything useful with the contract until it has been mined.

ethuser@host01:~$ geth --datadir node1 --networkid 98765 console
INFO [07-19|16:33:02] Starting peer-to-peer node instance=Geth/v1.6.7-stable-ab5646c5/linux-amd64/go1.8.1
....
> loadScript("contract/Simple.abi")
true
> loadScript("contract/Simple.bin")
Unlock account 0xf74afb1facd5eb2dd69feb589213c12be9b38177
Passphrase: 
INFO [07-19|16:34:16] Submitted contract creation              fullhash=0x318caec477b1b5af4e36b277fe9a9b054d86744f2ee12e22c12a7d5e16f9a022 contract=0x2994da3a52a6744aafb5be2adb4ab3246a0517b2
true
> simple
{
....
  }],
  address: undefined,
  transactionHash: "0x318caec477b1b5af4e36b277fe9a9b054d86744f2ee12e22c12a7d5e16f9a022"
}
> simple.multiply
undefined
> miner.start(1)
INFO [07-19|16:36:07] Updated mining threads                   threads=1
...
INFO [07-19|16:36:21] 🔨 mined potential block                  number=15 hash=ac3991…83b9ac
...
> miner.stop()
true
> simple.multiply
function()
> simple.multiply.call(5,6)
30
> simple.arithmetics.call(8,9)
[17, 72]
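
One more thing worth knowing: now that the contract has been mined, simple.address is populated. If you restart the console later (or are on another node that has loaded Simple.abi), there is no need to redeploy – you can reattach to the existing contract with simpleContract.at(), using the contract address from the deployment log above:

> var simple2 = simpleContract.at("0x2994da3a52a6744aafb5be2adb4ab3246a0517b2")
> simple2.multiply.call(6,7)
42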

Set up a second node in the network

I’ll run the second node on the same virtual machine to keep things simple. The steps to take are:

  1. Make sure the existing geth node is running
  2. Create an empty data directory for the second node
  3. Add accounts for the second geth node as before
  4. Initialise the second geth node using the same genesis block as before
  5. Launch the second geth node setting bootnodes to point to the existing node

The second geth node will need to run on a non-default port.

Find the enode details from the existing geth node:

> admin.nodeInfo
{  enode: "enode://08993401988acce4cd85ef46a8af10d1cacad39652c98a9df4d5785248d1910e51d7f3d330f0a96053001264700c7e94c4ac39d30ed5a5f79758774208adaa1f@[::]:30303", 
...

We will need to substitute [::] with the IP address of the host, in this case 127.0.0.1.

To set up the second node:

ethuser@host01:~$ mkdir node2
ethuser@host01:~$ geth --datadir node2 account new
WARN [07-19|16:55:52] No etherbase set and no accounts found as default 
Your new account is locked with a password. Please give a password. Do not forget this password.
Passphrase: 
Repeat passphrase: 
Address: {00163ea9bd7c371f92ecc3020cfdc69a32f70250}
ethuser@host01:~$ geth --datadir node2 --networkid 98765 init genesis.json
INFO [07-19|16:56:14] Allocated cache and file handles         database=/home/ethuser/node2/geth/chaindata cache=16 handles=16
...
INFO [07-19|16:56:14] Writing custom genesis block 
INFO [07-19|16:56:14] Successfully wrote genesis state         database=lightchaindata                          hash=dd3f8d…707d0d
ethuser@host01:~$ geth --datadir node2 --networkid 98765 --port 30304 --bootnodes "enode://08993401988acce4cd85ef46a8af10d1cacad39652c98a9df4d5785248d1910e51d7f3d330f0a96053001264700c7e94c4ac39d30ed5a5f79758774208adaa1f@127.0.0.1:30303" console

Wait a little and you will see block synchronisation taking place:

> INFO [07-19|16:59:36] Block synchronisation started 
INFO [07-19|16:59:36] Imported new state entries               count=1 flushed=0 elapsed=118.503µs processed=1 pending=4 retry=0 duplicate=0 unexpected=0
INFO [07-19|16:59:36] Imported new state entries               count=3 flushed=2 elapsed=339.353µs processed=4 pending=3 retry=0 duplicate=0 unexpected=0

To check that things have fully synced, run eth.getBlock('latest') on each of the nodes. If things aren’t looking right then use admin.peers on each node to make sure that each node has peered with the other node.
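
If they haven’t peered automatically, you can also add the peer by hand from the second node’s console, using the enode we noted earlier (with [::] replaced by the host IP), and then confirm with admin.peers:

> admin.addPeer("enode://08993401988acce4cd85ef46a8af10d1cacad39652c98a9df4d5785248d1910e51d7f3d330f0a96053001264700c7e94c4ac39d30ed5a5f79758774208adaa1f@127.0.0.1:30303")
true
> admin.peers.length
1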

Now you can run the miner on one node and run transactions on the other node.

Notes

[1]  https://github.com/ethereum/go-ethereum/issues/3793:

Compiling via RPC has been removed in #3740 (see ethereum/EIPs#209 for why). We will bring it back under a different method name if there is sufficient user demand. You’re the second person to complain about it within 2 days, so it looks like there is demand.

[2] The official guide to Contracts is out of date. I spotted some out-of-date material on Mining too and submitted an issue, but updating the official docs doesn’t seem to be much of a priority, so I figured I would collect my learnings here for now.
[3] You can pre-allocate ether in the genesis.json if you prefer, but that would mean a little more cut and paste which I am doing my best to minimise here.

Learning Haskell gave me a whole new outlook on programming

A while back I decided to learn Haskell as a counterpoint to the Ruby/Java etc. that I was more familiar with. I am very grateful for the new perspective that learning Haskell has given me. In particular:

  • I always used to start with database tables and a user interface. Haskell forces you to think about the functions and how they work on data structures. Not having to think about data storage is strangely liberating as it is much easier to change your data structures if there is no UI or database to worry about.
  • It might need a lot more thinking to write a piece of code, but by the time you’ve finished it invariably feels very elegant.
  • Writing pure code and then wrapping IO around it later really forces you to write code that is simpler and more testable.
  • I now appreciate how intimidating jargon can unnecessarily confuse and put people off. For example, Haskell purists might want you to grok Category Theory even if it’s not that relevant to writing Haskell apps.

The learning curve has been ridiculous. Have a look at Tymon Tobolski’s comparison of using Ruby vs Haskell in a small Redis application. The Haskell version is terse to the point of being incomprehensible at the outset. As a Haskell learner I couldn’t find anything that packed a similar “wow” to “A blog in 15 minutes with Ruby on Rails”.

Here’s a small Scotty/Warp Haskell app I put together recently. (Scotty is Haskell’s answer to Sinatra for you Rubyists).

Impress your CTO – Track Everything

The day after your new feature goes live, someone will want to know how well it’s working (or not). They don’t just mean “are there any exceptions in the logs”. They mean “how are people using it, if at all?”

Hopefully you had thought about unspecified requirements, and so you already implemented something that they can see. It doesn’t even have to be super heavyweight. I’ve seen good stuff done using something sophisticated like Heap Analytics, and I’ve also seen people get actionable insights from something as simple as Google Analytics.

Whatever you choose to do, make sure that there is an easy way for a non-developer to dig around the stats, and ideally download some raw data to play around with offline.

Just be sure to track everything.


Impress your CTO – avoid these boring complaints

Here are some of the most boring complaints I hear.

  • No-one knows how it works
  • You can’t measure the quality of what I’m doing
  • It’s because of all of our technical debt
  • We need to re-write this from scratch

They each contain an element of truth but at the same time they manage to completely miss the point. Hence boring.

No-one knows how it works

This really means “I don’t know how it works and nor does the person I usually work closest with”. I once saw a developer spend a huge amount of time trying to recreate, by reading the friendly source, the possible state transitions for a particularly important entity. Because “no-one understood it”. This had two problems:

  1. He may have missed something
  2. By reverse engineering requirements from the working code, what he ended up describing might have included bugs rather than how the feature was supposed to work.

And in this particular instance I could have pointed the developer to one person still in the business and one person who had left but would still be helpful. Next time you’re tempted to assert “no-one understands these state transitions” just change it to a question: “Is there anyone either on the team here, or who has left but is still friendly, who can help me understand how these state transitions are supposed to work?”

You can’t measure the quality of what I’m doing

This is invariably an attempt to hide something. I once worked with a team who didn’t report their test coverage because the lead developer felt that software is too complicated for a metric as simple as test coverage to be meaningful. We debated the subject and eventually agreed that although 100% coverage is probably not that meaningful, it is worth at least knowing where you are. Where were they when they measured code coverage? About 15%. I was amazed. Here we were debating the costs and benefits of 90% or 100% code coverage, and all the time we were staring 15% in the face. I cannot think of anyone who would seriously argue that code coverage of 15% is in any way acceptable. For sure, you can’t measure everything, but the skill of a good developer is in helping to find a useful metric. For example, on a recent project we agreed on a simple metric: if RubyCritic gives us an A or a B grade then that’s good; if it’s any worse then we need to know why. It’s not perfect, but it’s a lot better than hiding behind “you can’t measure what I’m doing”.

It’s because of all of our technical debt

As an experiment, I once agreed with a team to spend one month just doing technical-debt cleanup. The results? Nothing noticeably better or faster for the users, nothing of notably better quality as far as the QA people were concerned, no metric to show anything had improved, and the same developers were still making the same complaints about technical debt 3 months later. The reality is that there will always be technical debt of some shape or form. Just like real-world debt, some technical debt can be good if it helps you achieve ends that you couldn’t otherwise achieve. Better developers would have a plan that says, for example, “Technical debt in module X is causing us problems; in order to fix those we will need to do Y”. This is better because it is specific, measurable and, if defined well enough, deliverable.

We need to re-write this from scratch

Stop thinking that. It’s a dreadful idea. Whenever you think of re-writing, you are thinking of the 20% of the system that is a PITA to maintain. You’re not thinking of the 80% of the system that you will also need to re-write. I remember one project where a 5-month re-write was still going on 18 months later, still with no end in sight. And another where a re-architected system was being built in parallel but couldn’t keep up with the new features being added to the “legacy” platform. In short, I’ve never seen a complete re-write that people were glad to have done. If you do need to make dramatic changes then you will need to find some way to change very specific parts of the application one by one. It will take a very long time: make sure you do your homework before advocating this.

Impress your CTO – Define your own NFRs

In the last instalment I talked about unspecified requirements. These are the ones that your product owner takes for granted: Of course the system should export to Excel; of course it should authenticate with Facebook; of course the system should load pages blazingly fast irrespective of how much data is thrown at it.

The most common of these unspecified requirements are the Non-Functional Requirements (NFRs). And the most common of these NFRs is “how long should the response time be”. So I find it surprising that response times are very rarely (if ever) mentioned during requirements definition work. Avoiding the topic early in the project is a sure way to have problems later on in the lifecycle.

It’s not uncommon to experience a conversation like this:

Sales: “This software is dreadful, it just took forever to load the dashboard in a crucial demo”
Dev: “Let’s see what’s up… oh yes you created a wizzabanga with 38 different permutations”
Sales: “Well yes of course I did. Then it took forever to load into the dashboard. Your software sucks.”
Dev: “But we didn’t have any NFRs”
Sales: “What’s an NFR”
… some tedious conversation omitted …
Dev: “So give me an NFR”
Sales: “OK, I want the page to load up in 200ms even if I’ve got 1000 wizzabangas each with 100 permutations”
Dev: “Hmmm… going to cost you”

etc

It’s meaningless to ask your user base for open-ended NFRs. Clearly they want everything to be really fast and really easy and really secure and ready next week. Much more useful is for you to set out some reasonable NFRs that you think are deliverable in a reasonable timeframe, and even impose some sensible limits or warnings in the system to ensure those NFRs are supportable. Then at least you have an NFR, even if it’s one that you created.

For example, if you think it’s reasonable for the page to load “fast enough” when your wizzabanga has up to 10 permutations, then either impose a limit in the UI, or just add a sensible warning, e.g. “We recommend that you have no more than 10 permutations in your wizzabanga. You can add more, but please note that you will need to be patient when loading larger wizzabangas.”

Impress your CTO (3)

Imagine the following conversation between a product owner and a developer:

Product Owner: “I want a CRM system”
Developer: “What does that do?”
PO: “It’s a customer database that lets me manage and report on my communications with my customers”
Dev: “That sounds easy, I’ll build you one this iteration”

One week later …

Dev: “Here you go. You log in here and here’s a screen where you can add a record for each customer. When you click into a customer record you can also add some notes for each time you’ve talked to them”
PO: “Wow, you did all that in a week, awesome. Now let’s add in the ability to make some notes for future calls that I need to do and a screen to show what tasks I have upcoming”
Dev: “No problem”

One week later …
PO: “This is so cool”

Brand new projects often start like this. But it doesn’t take long for fatigue to kick in. A few more iterations and all of a sudden you’re getting bogged down in details like:

  • Download to Excel
  • Upload from Excel
  • The fact that you should really be validating postcodes
  • And show the location on a map
  • And have better collaboration facilities
  • And handle customer segmentation
  • Ability to handle email templates
  • And initiate voice calls
  • And route incoming calls to an appropriate agent
  • And let someone apply a credit to a customer’s account
  • etc, etc, etc

At this point you start to realise why no-one sane would build their own CRM.

The Importance of Unspecified Requirements

You see, you have functional requirements and you have non-functional requirements. But beyond all of these you have the unspecified requirements. These last ones are really important because your product owner considers them so obvious that they aren’t worth mentioning. Of course your CRM system has to handle loading records from Excel – only a buffoon would not know that!

The solution isn’t to insist on 100% detailed specifications before you start building – that way lies another type of madness. Nor should you consider your job well done just because each week you built what your customer asked for. The best developers are the ones who deliver the best working software, which is not as simple as building what was written down in the spec.

Worry about the unspecified requirements before you get too far down any path. It doesn’t have to involve much coding (if any). If someone asks you to build a CRM, do a sort of throwaway prototype first – not one that involves any coding. (I’ve seen some great throwaway prototypes built using Excel.) Why not get a one-month license to Salesforce (or whatever)? This is a great way to see which features they really value, where their frustrations are, etc. If you discover that Salesforce is the right answer for them, then thank your lucky stars that you’ve had a narrow escape from attaching a massive bespoke millstone round your neck for the rest of your time with the company. And if you discover that there is something really special about your requirements, well, then you can build a truly great application that addresses the real problems your users are facing.


Impress Your CTO (2)

People expect software to “just work”. You can guarantee that it won’t “just work” when someone decides to throw more data at it than you expected. So my 2nd tip is:

Enforce Sane Limits

From a developer point of view it makes sense to think in terms of 1 to Many or Many to 1. Why put in extra work to enforce some limitation when you can have unlimited flexibility for free?

Because the real world doesn’t work that way and it’s more important to model the real world than it is to create infinite flexibility. Some good reasons:

  1. Usability
  2. Performance
  3. Commercial reasons

Usability

The more “many’s” in your one to many, the more UI real estate you need to think about. Pagination, sorting, searching, exporting lists to Excel. Maybe favouriting so that people can shortlist the results because they can’t make sense of the whole list. Nothing here is super-complicated to code, but given that the scarce resource in most tech operations is developer time, as a great developer you will be ensuring that you are spending your valuable time on the most value-add activities.

Performance

Be honest: you aren’t about to write comprehensive automated performance tests. If you allow people to add unlimited items then eventually someone will do so and, probably sooner than you expect, you will hit serious performance issues. That means spending valuable developer time addressing those issues, because by the time you have enough traction to have performance problems you aren’t going to be able to withdraw the poorly performing feature [*].

Commercial Reasons

Maybe it’s a surprise to see this as a reason, but it may be the one you would do best to remember. The simple point here is that if the standard version of your product allows the user to add up to 15 items to a Thingamy, then not only do you lower the risk of performance issues, but you also have a built-in mechanism that your product managers can use to upsell your customers to the next subscription level: “You want to handle more than 15 items? Let me put you through to our Enterprise Sales team”. If there is demand for the feature and customers will pay for it then fantastic – it will be a great feature to spend some real effort and attention on, giving it a stunning user experience that performs really well.

Conclusion

I’m not saying to do anything complicated in your database. Leave the database handling a 1 to Many relationship; just put a check somewhere in your business logic. Next time you are discussing a feature with your product owner and you are thinking about the many side of the object model, just ask the question: “Would 5 be enough?”
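
To illustrate (the names and the limit here are made up, and your business logic may well live in another language), the check can be as small as this:

// Illustrative only: the database stays a plain one-to-many;
// the sane limit lives in the business logic.
var MAX_ITEMS = 15; // the limit agreed with your product owner

function addItemToThingamy(thingamy, item) {
  if (thingamy.items.length >= MAX_ITEMS) {
    throw new Error("A Thingamy can hold at most " + MAX_ITEMS + " items");
  }
  thingamy.items.push(item);
}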

[*] This is a case of “do as I say, not as I do”. In my own side project, which lets a user merge and stitch together data from different Excel files, I didn’t impose any limits. I asked some friends and family to test it, and the second thing my brother did was try to crash it. He succeeded. So now I’ve implemented some limit checking.