The Bitcoin Non-standard

I wrote this guide on 3/30/2019. Standardness is node policy, not consensus, and may change from version to version. This guide may not be a complete reference.

Bitcoin transactions seem simple. As children, we were taught about money and how to make change with cardboard pennies, and we interact with cash and credit cards every day. So transactions feel friendly and intuitive to most adults. It's just sending money. Unfortunately, Bitcoin has an extra requirement that complicates transactions: every full node has to check each transaction multiple times. They check it the first time they see it (when it's broadcast to the network), and again when it appears in a block.

The process of multiple checks imposes significant costs on a full node. Checking a single transaction is pretty cheap, but checking thousands gets pretty expensive. And that's assuming every transaction is simple! Unfortunately, Bitcoin allows us to build some very complex transactions. These transactions could have a lot of outputs, or a lot of signatures, or maybe just a lot of junk. These weird transactions could be made by old software like the Satoshi (君が代は) client or by broken, bad, or malicious software. Any way you slice it, some transactions are more complex, and therefore more expensive to verify, than others.

So what do we do to deal with these? We don't want to forbid unusual transactions outright, as someone could be relying on these features! We don't want to burn their money accidentally by prohibiting these transactions. Instead, we discourage abnormal transactions by defining rules for "standard" transactions. Standard transactions get accepted by default-configured nodes and relayed to other nodes on the network. Eventually, a standard transaction will reach a miner and end up in a block.

Non-standard transactions, in contrast, will not be allowed in the mempool of default-configured nodes, unless they're included in a block first. As a result, these transactions are not broadcast or relayed through the network. That said, if a miner sees that transaction, she is free to include it in a block (if it passes all validity checks). All nodes will accept the non-standard transaction once it's in a block. Therefore, while non-standard transactions can be made, they have to be submitted directly to a miner rather than broadcast as a normal transaction, resulting in systemic inefficiency.

This post is a deep dive into non-standardness. It details every check that can trigger a REJECT_NONSTANDARD response. Like my previous post on Bitcoin timelocks, the goal is to aggregate information from the codebase and elsewhere to present a comprehensive reference for developers using Bitcoin.

REJECT_NONSTANDARD

Transaction and block validation are handled in validation.cpp, a sensibly named file. It contains dozens of validity checks that result in rejection codes like REJECT_DUPLICATE for transactions already seen or REJECT_INVALID for transactions and blocks that are faulty in some way. Eleven of the transaction checks on mempool admission can result in REJECT_NONSTANDARD. Each of these rejections is tagged with a user-facing error message like "non-BIP68-final". The list of checks below uses error messages whenever possible, but IsStandardTx could result in any of several codes, so I've listed its function name instead.

  1. IsStandardTx
  2. tx-size-small
  3. non-final
  4. non-BIP68-final
  5. bad-txns-nonstandard-inputs
  6. bad-witness-nonstandard
  7. bad-txns-too-many-sigops
  8. too-long-mempool-chain
  9. too many potential replacements
  10. replacement-adds-unconfirmed
  11. non-mandatory-script-verify-flag

Many of these checks are relatively simple, but the first one is extremely complex. We will tackle them by order of complexity, starting with the simplest, and ending with the most complex.

tx-size-small

If a transaction, without witnesses, is smaller than MIN_STANDARD_TX_NONWITNESS_SIZE bytes, it is considered non-standard. The minimum transaction size is set to 82 bytes in policy.h (along with many other constants we'll reference today), as 82 bytes is the smallest possible p2wpkh transaction. That's 4 bytes for the version, 42 bytes for a single input, 32 bytes for a single output, 4 bytes for the timelock field. Anything smaller than that is using an unknown type of output script to shave bytes. Unknown types of output scripts are discouraged, so we reject transactions smaller than 82 bytes.
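The 82-byte floor is easy to check by hand. Here's a rough sketch of the arithmetic in Python. The field names are my own (they don't come from policy.h), and I'm assuming the 42-byte input and 32-byte output figures fold in the one-byte input and output counts.

```python
# Field sizes (bytes) for a minimal 1-input, 1-output p2wpkh transaction,
# serialized without witness data. Names are illustrative, not from Bitcoin Core.
VERSION = 4
INPUT_COUNT = 1        # varint
OUTPOINT = 32 + 4      # previous txid + output index
SCRIPTSIG_LEN = 1      # varint for an empty scriptSig
SEQUENCE = 4
OUTPUT_COUNT = 1       # varint
VALUE = 8
SCRIPTPUBKEY_LEN = 1   # varint
P2WPKH_SCRIPT = 22     # OP_0 followed by a 20-byte key hash push
LOCKTIME = 4

MIN_STANDARD_TX_NONWITNESS_SIZE = 82


def min_p2wpkh_nonwitness_size() -> int:
    """Add up the smallest possible p2wpkh transaction, witness excluded."""
    input_bytes = INPUT_COUNT + OUTPOINT + SCRIPTSIG_LEN + SEQUENCE
    output_bytes = OUTPUT_COUNT + VALUE + SCRIPTPUBKEY_LEN + P2WPKH_SCRIPT
    return VERSION + input_bytes + output_bytes + LOCKTIME


def is_too_small(nonwitness_size: int) -> bool:
    """True if a transaction of this size would trip tx-size-small."""
    return nonwitness_size < MIN_STANDARD_TX_NONWITNESS_SIZE
```

Anything that comes in under that total has to be shaving bytes off the output script somewhere.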

non-final & non-BIP68-final

These are the codes for time-locked transactions. In this context "final" means "no longer time-locked," and "non-final" means "still time-locked." Check out my previous post, Bitcoin's Time Locks, for an in-depth look at timelock semantics. The main thing to understand is that transactions that are still time-locked are not just invalid, they're also non-standard. The "non-final" code indicates the absolute lock hasn't passed yet, while "non-BIP68-final" indicates that the relative timelock based on sequence numbers has not elapsed (see the full documentation in BIP 68).

One non-intuitive feature pops up here. Recall that absolute and relative timelocks may refer to either the block timestamp or block height. Transactions locked to a specific block height become standard as soon as the previous block is seen. Which is to say, a transaction locked to 5001 will be accepted and relayed by any node that has seen block 5000. On the other hand, because block timestamp rules are less predictable, we won't consider them standard until after we've seen a block in which they would be valid. This means that height-locked transactions are standard before they're valid, but timestamp-locked transactions don't become standard until they're valid.

too-long-mempool-chain

The mempool should only ever contain valid transactions that are eligible for inclusion in new blocks. As such, when reorgs happen, we have to clean out the mempool to maintain consistency. We don't know how many transactions we'll have to clear, or which transactions will become invalid, so we want to take precautions to ensure the maintenance won't be overly expensive. For instance, what if we had to drop 10,000 children because of a single bad parent? We'd have to crawl a gigantic portion of the mempool to manage one weird family. That sounds like an attack vector to me. For this reason, we put limitations on the relationships that any new transaction can have with those already in our mempool.

The default limits for this check (called DEFAULT_ANCESTOR_LIMIT, DEFAULT_ANCESTOR_SIZE_LIMIT, DEFAULT_DESCENDANT_LIMIT, and DEFAULT_DESCENDANT_SIZE_LIMIT) are set in validation.h and can be configured via config options. A default-configured node won't allow more than 25 ancestors or descendants in the mempool, or more than 101 kB of either ancestors or descendants. These limitations mean that in the worst case scenario, the resource cost of evicting bad ancestors or descendants of any given transaction will be low.
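As a rough illustration, the four limits boil down to a check like the one below. This is a sketch, not Core's code: the function name is hypothetical, I'm assuming a kB means 1,000 bytes, and I'm glossing over exactly how Core tallies the counts and sizes it feeds into the comparison.

```python
# Default limits as described above (sizes in kB).
DEFAULT_ANCESTOR_LIMIT = 25
DEFAULT_ANCESTOR_SIZE_LIMIT = 101
DEFAULT_DESCENDANT_LIMIT = 25
DEFAULT_DESCENDANT_SIZE_LIMIT = 101


def chain_too_long(ancestor_count: int, ancestor_bytes: int,
                   descendant_count: int, descendant_bytes: int) -> bool:
    """Hypothetical helper: True if any of the four default limits is exceeded."""
    return (
        ancestor_count > DEFAULT_ANCESTOR_LIMIT
        or ancestor_bytes > DEFAULT_ANCESTOR_SIZE_LIMIT * 1000
        or descendant_count > DEFAULT_DESCENDANT_LIMIT
        or descendant_bytes > DEFAULT_DESCENDANT_SIZE_LIMIT * 1000
    )
```

Because each limit is a small constant, the worst-case cleanup work per transaction is bounded no matter how crowded the mempool gets.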

too many potential replacements & replacement-adds-unconfirmed

These checks are part of the node's default mempool replacement policy. They're performed when a new transaction spends the same input as a transaction already in the mempool. The node will only keep one, and these checks help decide which one.

The first, "too many potential replacements," is notable for being the only non-hyphenated error message we'll deal with today. This message is triggered if a potential replacement would force a large group of other transactions to be replaced. Like the descendant issue above, this check aims to prevent large resource costs triggered by individual transactions. The default limit (called nConflictingCount and set in validation.cpp, https://github.com/bitcoin/bitcoin/blob/0.18/src/validation.cpp) restricts replacements to touching no more than 100 transactions in the mempool.

On the other hand, "replacement-adds-unconfirmed" only cares about ancestors. We don't want to replace a tx if the new transaction's parents are low-fee junk sitting in the mempool. While we could check all unconfirmed ancestors' fee rates for the new tx, for the moment we don't allow replacements to have new unconfirmed ancestors. If the replacement adds new unconfirmed ancestors, default-configured nodes won't accept or relay it.

non-mandatory-script-verify-flag

This is where complexity starts ramping up to deal with Bitcoin Script. Because Script is a powerful tool and completely user-defined, we want to be extremely careful about its usage. Signature checking, which Script does all the time, is possibly the most expensive part of evaluating transactions, so we need to make sure people aren't abusing it. Additionally, Script is a prime target for malleation (changing a transaction's id without changing its validity). To prevent this, we have "mandatory" script checks that render a transaction invalid, and "non-mandatory" script checks that render it non-standard. Generally, non-mandatory checks exist to prevent malleation.

Further reading about several script checks can be found on StackExchange.

Signature formatting

To be considered standard, any signature in a transaction must be strict DER-encoded, and its s value must be in the lower half of its possible range (no greater than half the curve order). We call this the Low-S rule. These rules refer to standardized cryptography, so if you don't know what any of those words mean, you're not alone. If you use a standard library like libsecp256k1 you don't have to worry about this at all.
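If you're curious what the Low-S rule looks like in practice, here's a minimal sketch. "Half" is measured against the secp256k1 group order n. The function names are my own, and the sketch skips DER parsing entirely and just works on the integer s.

```python
# Order of the secp256k1 group (the number of points on the curve).
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def is_low_s(s: int) -> bool:
    """True if s sits in the lower half of the valid range."""
    return 1 <= s <= SECP256K1_N // 2


def normalize_s(s: int) -> int:
    """Map a high s to its low-s twin; (r, s) and (r, n - s) both verify."""
    return s if is_low_s(s) else SECP256K1_N - s
```

The reason this matters for malleability is the second function: every signature has a twin, so without the rule anyone could flip s and change the transaction id.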

NULLFAIL

Sometimes a developer might intentionally want a signature check in their script to fail so that another signature later in the Script can be checked instead. Anyone can generate any number of invalid signatures on a transaction. Any number you can imagine is an invalid signature! So, if a script requires an invalid signature, anyone could replace that invalid signature with a different one and malleate the transaction. Doing so would malleate legacy transactions (because all failed signatures were included in the transaction id).

Intentional failures are non-standard unless the signature is null (0). We call this rule NULLFAIL. As a result, while the transaction id can still be malleated, the malleated versions can't be broadcast. The NULLFAIL rule applies to some rarely seen scripts, but will probably never affect an average user.

SCRIPT_VERIFY_MINIMALDATA

This rule is another malleability restriction. Bitcoin Script represents numbers and other stack items as variable-length bytestrings. This means that in some cases we can insert extra zeroes in the most-significant side of a bytestring without changing the result of script execution. Think of it like replacing the number 37 with 0037 in your math homework. Technically you got the right answer, but the teacher is still annoyed. Scripts that push more than the absolute minimal amount of data are considered non-standard and are not relayed.
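Here's a small sketch of the number half of this rule. Script numbers are little-endian, with the sign stored in the top bit of the last byte, so the "extra zeroes" show up as padding bytes at the end of the bytestring. The function name is mine, and this only covers number encoding, not the choice of push opcode.

```python
def is_minimally_encoded(num: bytes) -> bool:
    """Check that a Script number carries no unnecessary padding byte."""
    if len(num) == 0:
        return True  # the empty bytestring is the canonical zero
    # The last byte holds the sign bit (0x80). If its other seven bits are
    # all zero, the byte is only justified when the previous byte already
    # uses its top bit for magnitude.
    if num[-1] & 0x7F == 0:
        if len(num) == 1 or num[-2] & 0x80 == 0:
            return False
    return True
```

So 37 as a single byte 0x25 passes, while 0x25 0x00 (the "0037" from the homework example) does not. Negative zero, the lone byte 0x80, fails too.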

SCRIPT_VERIFY_NULLDUMMY

Oh boy, NULLDUMMY has a weird history. Satoshi (千代に), in his original Script implementation, made an off-by-one error in OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY. As a result, both opcodes pop one extra argument off the stack. This argument (the dummy) does not affect the script but is still required. The dummy argument could be anything, which means that someone could (once again!) malleate the transaction by replacing the dummy with any other number. The NULLDUMMY rule requires the dummy to be exactly null (0). If the dummy is anything else, the transaction is non-standard and won't be relayed.

SCRIPT_VERIFY_DISCOURAGE_UPGRADABLE_NOPS

As I mentioned in my timelocks post, OP_CHECKLOCKTIMEVERIFY and OP_CHECKSEQUENCEVERIFY replaced OP_NOP2 and OP_NOP3 respectively. We've established a precedent that numbered OP_NOP opcodes could be used for anything in the future, but we discourage people from using them. As a result, default-configured nodes treat transactions with executed upgradable NOPs in their scripts as non-standard. So, if we ever replace OP_NOP4 with new functionality, old nodes that can't fully evaluate that Script will not admit these transactions to their mempool. This failsafe ensures that transactions that are invalid according to some future rule won't get stuck in the mempools of old nodes because they interpret them as valid.

Cleanstack

When a script finishes execution, it checks the stack. If the stack is empty, or the top item is zero, the execution fails. If the stack has contents, and the top item is non-zero, the execution succeeds. The Cleanstack rule adds two extra requirements. First, the stack must have exactly one item. Second, that item must be precisely True (1). The stack has to be clean. Requiring this prevents scripts from encoding extra data that are not critical to their execution, as that data could be malleated. Any transaction with a script that violates this rule is non-standard.

SCRIPT_VERIFY_DISCOURAGE_UPGRADABLE_WITNESS_PROGRAM

SegWit introduced the concept of "witness program versions." Each SegWit output commits to its program's version. While only version 0 exists today, there are 16 more version numbers available, and the use of any of these numbers in an output script is valid. However, because those program versions aren't defined yet, any funds sent there will be anyone-can-spend outputs! As with the NOPs, we discourage the use of upgradable programs, because we don't know what they do. At present, any witness program above version 0 is non-standard.

SCRIPT_VERIFY_MINIMALIF

Conditionals, like OP_IF, allow developers to create complex scripts. Satoshi's (八千代に) original implementation of OP_IF accepted any non-zero argument as True, and any zero argument as False, including negative-0, which he nonsensically decided should be a thing in Script. As such, any argument to OP_IF in a script can be malleated into many other arguments. To prevent this, nodes don't relay transactions unless they use the "minimal" argument for every OP_IF -- True (1), or False (0), and nothing else.

SCRIPT_VERIFY_WITNESS_PUBKEYTYPE

Public keys in elliptic curve cryptography are points on a curve. This curve is symmetric across the x-axis, which means that for any given x value, there are exactly two y-values. So, any given point on the curve can be uniquely identified by either the full (x, y) coordinate pair, or by the x-coordinate and a single bit identifying which y-coordinate to use. The full (uncompressed) point is 65 bytes in Bitcoin (1 byte for a marker, 32 for x, and 32 for y). To save space, we tend to use the compressed public key, which is only 33 bytes (1 byte for a marker telling you which y to use and 32 bytes for x), saving 32 bytes per pubkey. Transactions can have many public keys, so the space savings add up substantially over time. To encourage the use of compressed keys, any witness with an uncompressed key is considered non-standard.
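Compression itself is just byte shuffling. Here's a minimal sketch, using the standard marker bytes (0x04 for uncompressed, 0x02 for an even y, 0x03 for an odd y). The function names are mine, and the sketch doesn't check that the point is actually on the curve.

```python
def compress_pubkey(uncompressed: bytes) -> bytes:
    """Turn a 65-byte uncompressed key into its 33-byte compressed form."""
    if len(uncompressed) != 65 or uncompressed[0] != 0x04:
        raise ValueError("expected a 65-byte key starting with 0x04")
    x = uncompressed[1:33]
    y = uncompressed[33:65]
    # y is big-endian, so its parity lives in the last byte.
    prefix = b"\x02" if y[-1] % 2 == 0 else b"\x03"
    return prefix + x


def is_compressed(pubkey: bytes) -> bool:
    """Shape check only: 33 bytes with a 0x02 or 0x03 marker."""
    return len(pubkey) == 33 and pubkey[0] in (0x02, 0x03)
```

A witness holding a key that fails the second check is the kind of thing this rule turns away.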

SCRIPT_VERIFY_CONST_SCRIPTCODE

Satoshi (細石の巌となりて) designed an interesting system to prevent reuse of the same signature from satisfying multiple OP_CHECKSIGs in the same script. Part of this system was a function called FindAndDelete informed by a special opcode called OP_CODESEPARATOR. Together, these would be used to modify the script to form the scriptcode, to which each signature would commit. In other words, Script developers can be picky about signatures created for specific subsections of their scripts. Unfortunately, FindAndDelete turned out to be a nightmarish black hole with infinite problems that nobody wants to maintain or update. We can't forbid it entirely by soft-forking it out, because somebody may have money locked by a script that relies on it. Instead, we discourage it by making any transaction that uses it non-standard.

bad-txns-too-many-sigops

Alright! Those are all the non-mandatory Script verification flags. Now we get into the real fun: signature operations (or "sigops"). Bitcoin Script has very little functionality. As a result, it is cheap to evaluate. Signature verification is by far the most expensive operation performed. Bitcoin Script's simplicity allows for efficient static analysis of scripts. If we count the number of signature-checking operations without running the script, we can determine how expensive the script will be to run. If a transaction has more than MAX_STANDARD_TX_SIGOPS_COST across all of its scripts, it is non-standard.

In policy.h we see that MAX_STANDARD_TX_SIGOPS_COST = MAX_BLOCK_SIGOPS_COST/5;. We then can check consensus.h and see MAX_BLOCK_SIGOPS_COST = 80000;. So a standard transaction has at most 16,000 sigops cost. But how do we calculate sigops cost? If only there were a function used to calculate that.

GetTransactionSigOpCost

This function is found in tx_verify.cpp.

At this point let's take a deep breath, and try to remember how great SegWit is because it's about to complicate our lives. SegWit makes transactions cheaper by moving information from inputs into witnesses. These witnesses don't need to be stored forever or even validated by old nodes. As a result, all data and operations are cheaper in a witness than they are in a transaction. Unfortunately, that means that we have to distinguish between sigops in a witness, and a transaction body.

Categorizing sigops by their location goes back to the introduction of pay-to-scripthash in 2012. As a result, we sort sigops into three categories: body, p2sh, and witness. Each of these has a different cost function, which is calculated by various other functions.

The general outline is that we count the sigops from the transaction body and from p2sh redeemScripts, multiply those counts by 4 (the WITNESS_SCALE_FACTOR, set in consensus.h), and then add in the sigops from witnesses, which are not scaled. Each of the unique counting functions may adjust its results slightly before returning its count. Intuitively, sigops are very expensive in the body of the transaction, and cheapest in the witness.
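The combining step can be sketched in a few lines. I'm following BIP141's accounting here, in which sigops from the body and from p2sh redeemScripts are scaled by WITNESS_SCALE_FACTOR while witness sigops are added as-is. The function names and the three-argument shape are my own simplification. The real function walks the transaction and its inputs itself.

```python
WITNESS_SCALE_FACTOR = 4
MAX_BLOCK_SIGOPS_COST = 80000
MAX_STANDARD_TX_SIGOPS_COST = MAX_BLOCK_SIGOPS_COST // 5


def transaction_sigop_cost(legacy_sigops: int, p2sh_sigops: int,
                           witness_sigops: int) -> int:
    """Combine the three per-category counts into one scaled cost."""
    cost = legacy_sigops * WITNESS_SCALE_FACTOR   # body scripts
    cost += p2sh_sigops * WITNESS_SCALE_FACTOR    # redeemScripts
    cost += witness_sigops                        # witnesses, unscaled
    return cost


def has_too_many_sigops(cost: int) -> bool:
    """True if the cost would trip bad-txns-too-many-sigops."""
    return cost > MAX_STANDARD_TX_SIGOPS_COST
```

Splitting the counting from the scaling like this makes it easier to see where the discount for witness data comes from.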

GetSigOpCount

This function is found in script.cpp.

The lowest-level way to count sigops for a script is GetSigOpCount. It's a property of the CScript data structure and is called by all other sigop-counting methods. Simply put, it crawls the script and finds all opcodes that check signatures and then totals them. As a general guideline, OP_CHECKSIG and OP_CHECKSIGVERIFY count as one sigop, whereas OP_CHECKMULTISIG and OP_CHECKMULTISIGVERIFY count as more than one.

The function can be run in "accurate" (post-SegWit) mode or legacy mode. Surprisingly, the code calls the argument fAccurate, implying (correctly in my opinion) that up until v0.7 everything was inaccurate. Accurate mode has reduced, variable costs for multisig operations, while inaccurate mode has high fixed costs. When running GetSigOpCount with fAccurate set to true, OP_CHECKMULTISIG(VERIFY) counts as one sigop for each public key it takes as input. For example, a 2-of-3 multisig would count as three sigops, while a 7-of-16 would count as 16. When running GetSigOpCount with fAccurate set to false, all multisig opcodes count as 20 sigops.
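To make the two modes concrete, here's a toy version of the counter. It's a sketch under simplifying assumptions: a script is a Python list where opcodes are strings like "OP_CHECKSIG" and data pushes are bytes, and the helper names are mine. In accurate mode, the pubkey count is read from the small-integer opcode sitting right before the multisig opcode.

```python
MAX_PUBKEYS_PER_MULTISIG = 20


def small_int(op):
    """Return n for the opcodes OP_1 through OP_16, otherwise None."""
    if isinstance(op, str) and op.startswith("OP_") and op[3:].isdigit():
        n = int(op[3:])
        if 1 <= n <= 16:
            return n
    return None


def get_sigop_count(script, accurate: bool) -> int:
    """Statically count signature checks without executing the script."""
    count = 0
    prev = None
    for op in script:
        if op in ("OP_CHECKSIG", "OP_CHECKSIGVERIFY"):
            count += 1
        elif op in ("OP_CHECKMULTISIG", "OP_CHECKMULTISIGVERIFY"):
            n = small_int(prev) if accurate else None
            # Fall back to the flat worst case when n can't be read.
            count += n if n is not None else MAX_PUBKEYS_PER_MULTISIG
        prev = op
    return count
```

Feeding it a 2-of-3 script such as `["OP_2", key1, key2, key3, "OP_3", "OP_CHECKMULTISIG"]` gives three in accurate mode and twenty otherwise, which lines up with the numbers above.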

GetLegacySigOpCount

This function is found in tx_verify.cpp.

This function counts all signature checking operations in the body of the transaction. This includes any OP_CHECKSIGs found in legacy p2pkh outputs, any OP_CHECKMULTISIG in bare multisig outputs, and any odd-duck signature checks in an input's scriptSig (not counting the redeemScript, which gets checked later). This function calls GetSigOpCount(false), which indicates we should use the more-expensive legacy parsing. That, along with the scaling performed on the result within GetTransactionSigOpCost, means that multisig operations cost a whopping 80 sigops in the transaction body. Single signature opcodes, on the other hand, only cost four.

GetP2SHSigOpCount

This function is found in tx_verify.cpp.

We also want to count sigops in p2sh redeemScripts. To encourage people to use them instead of complex scripts in outputs, we make them cheaper: since we call GetSigOpCount(true), multisigs are counted per public key rather than at the flat legacy rate. A 2-of-3 here counts as three sigops, while a single signature counts as one. GetTransactionSigOpCost then scales this count by four, just as it does for GetLegacySigOpCount, so those come to a cost of 12 and 4 respectively.

WitnessSigOps

This function is found in interpreter.cpp.

Sigops located in witnesses are counted similarly to those in redeemScripts. If it's a pay-to-witness-pubkeyhash (p2wpkh) witness, then it's worth one sigop. Otherwise, it's a pay-to-witness-scripthash (p2wsh) witness, and we count the sigops in the witnessScript with GetSigOpCount(true). Therefore, as with p2sh, a 2-of-3 here is worth three sigops, while a single signature is worth one. Witness sigop counts are added to the total as-is, with no WITNESS_SCALE_FACTOR multiplier.

Review

Transactions with a sigops cost above 16,000 are non-standard. Here are a few quick examples of scaled sigop costs for common structures:

# a p2pkh output has an OP_CHECKSIG, so 1 there
# then x4 for being in the tx body
p2pkh_output_sigops = 4
p2sh_output_sigops = 0
p2wpkh_output_sigops = 0
p2wsh_output_sigops = 0

# A bare msig has an OP_CHECKMULTISIG
# so 20 for the legacy GetSigops count
# then x4 for being in the tx body
bare_msig_output_sigops = 80

p2pkh_input_sigops = 0

# msigs: one for each pubkey, because fAccurate = true
# the p2sh count is then scaled x4; the witness count is not
two_of_three_msig_p2sh_input_sigops = 12
two_of_seven_msig_p2wsh_input_sigops = 7

# one for the implied CHECKSIG in the witness program
p2wpkh_input_sigops = 1

bad-witness-nonstandard

Let's step back from sigops for a moment. Don't worry, I love mechanically counting bytes too, so we'll get back to them over and over and over.

The IsWitnessStandard function is found in policy.cpp.

Witness standardness rules aim to prevent excessively large or expensive witnesses from burdening the network. They evaluate each witness present, checking their association with their prevout as well as their stack items.

If the transaction is a coinbase, or an input's witness is empty, it gets a free pass. Coinbase witnesses are blank, and empty witnesses can't very well be expensive to parse now, can they?

Provided that there are witness data to evaluate, we start by looking up the prevout so we can check the scriptPubkey. If the prevout's scriptPubkey is not p2sh or a witness program, then we reject the witness and treat the transaction as non-standard.

If the scriptPubkey is p2sh, we also try to parse the associated input's scriptSig to get the redeemScript. This attempt accounts for the intermediate witness-over-scripthash construction, which puts a witness program in the input's scriptSig. If we have any trouble extracting the redeemScript, we reject the witness and treat the transaction as non-standard (I'm going to be repeating this a lot, so better get used to it).

Now we have "prevScript." The value of the prevScript is either the prevout's scriptPubkey (if the prevout was a native SegWit witness program) or the input's redeemScript (if the prevout was a witness-over-scripthash p2sh output). We expect this prevScript to be a witness program (starting with either 0014 for wpkh or 0020 for wsh). If it's not a witness program, then we (say it with me!) reject the witness and treat the transaction as non-standard.

The prevScript will be either a wpkh or wsh witness program because we've already rejected everything else! If we have a wpkh program, we're done and the witness is standard, but if it's a wsh program, we have a few more steps to go through. The whole goal is to prevent resource consumption attacks, and while the resource usage of a wpkh program is fixed (one sigop and ~100 bytes in two witness stack items), the resource usage of a wsh program needs to be estimated.

There are three limits on p2wsh witnesses set in policy.h:

  1. MAX_STANDARD_P2WSH_SCRIPT_SIZE = 3600
  2. MAX_STANDARD_P2WSH_STACK_ITEMS = 100
  3. MAX_STANDARD_P2WSH_STACK_ITEM_SIZE = 80

Enforcing these is quite simple. The last item in the witness is the witnessScript. It shouldn't be longer than 3600 bytes. All other items in the witness are the stack. There should be no more than 100 stack items (not counting the witnessScript). None of these stack items should be longer than 80 bytes.

Given that a DER-formatted signature (the largest standard structure we regularly deal with in scripts) tops out at 74 bytes, 80 should be plenty. This upper-limit puts the maximum standard size of a witness at 11,600 bytes. That seems like plenty of space. And as usual, if any of these restrictions are violated, we reject the witness and treat the transaction as non-standard.
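The three p2wsh limits translate almost directly into code. Here's a sketch, assuming the witness arrives as a list of bytes objects with the witnessScript last. The function name is mine, and the sketch only covers the size checks, not the prevout lookups described earlier.

```python
MAX_STANDARD_P2WSH_SCRIPT_SIZE = 3600
MAX_STANDARD_P2WSH_STACK_ITEMS = 100
MAX_STANDARD_P2WSH_STACK_ITEM_SIZE = 80


def is_p2wsh_witness_standard(witness: list) -> bool:
    """Apply the three size limits to a p2wsh witness."""
    if not witness:
        return True  # empty witnesses already got their free pass
    witness_script = witness[-1]
    stack = witness[:-1]
    if len(witness_script) > MAX_STANDARD_P2WSH_SCRIPT_SIZE:
        return False
    if len(stack) > MAX_STANDARD_P2WSH_STACK_ITEMS:
        return False
    return all(len(item) <= MAX_STANDARD_P2WSH_STACK_ITEM_SIZE for item in stack)
```

The 11,600-byte ceiling falls out of the same constants: 100 stack items of 80 bytes each, plus a 3,600-byte witnessScript.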

bad-txns-nonstandard-inputs

The AreInputsStandard function is found in policy.cpp.

Input standardness rules prevent excessively expensive scriptSigs. They feature the reappearance of our old friend: sigops. Fortunately, input standardness is much simpler to verify than witness standardness. This function handles everything that requires looking up the prevout. Some checks that don't require looking up the prevout are performed in IsStandardTx before this function gets called.

For each input, we look up its prevout and pass it to the Solver. If the prevout type is TX_NONSTANDARD, then we reject this input and treat the whole transaction as non-standard. Don't worry about the Solver and TX_NONSTANDARD yet; we'll come back to them in-depth in our final section. If the prevout type turns out to be p2sh, we need to check the redeemScript. We'll try to parse the scriptSig to get it. If we can't parse it as a script, or it makes an empty stack, then something fishy is up. We expected a redeemScript, and one isn't there. Fishiness is the opposite of standardness, so we reject the input and treat the transaction as non-standard.

Finally, we need to check redeemScript's sigops. If GetSigOpCount(true) > MAX_P2SH_SIGOPS we reject the input. In policy.h MAX_P2SH_SIGOPS gets set to 15. So a p2sh redeemScript can have at most 15 signers (p2wsh witnessScripts can have more). Any more than that and the transaction becomes non-standard.

IsStandardTx

The IsStandardTx function is found in policy.cpp.

Aaaaand we saved the largest and most complex for last. As its name implies, this function checks the transaction as a whole. It performs syntactic checks on the transaction without the need to reference the prevouts or check signatures. As a result, it has low resource costs. For this reason, it is the first check performed. Expensive checks are performed only after the cheaper ones have passed.

General transaction checks

First, we reject transactions with versions other than 1 and 2. Technically, we check that the version is between 1 and MAX_STANDARD_VERSION (set to 2 in transaction.h), but it seems unlikely we'll see a new transaction version any time soon, given that it took seven years to reach version 2. Anyway, transactions with high version numbers are non-standard and get rejected with the error code "version."

Next, we check the transaction weight against a constant, MAX_STANDARD_TX_WEIGHT, set to 400,000 in policy.h. The transaction's weight is related to its size in bytes. Each byte of tx body counts as four (WITNESS_SCALE_FACTOR) weight. Each byte of witness data counts as one weight. The body has to be saved forever, while the witness can be pruned, so it makes sense that body bytes would cost more. If the total weight is greater than the maximum weight, the transaction is non-standard and rejected with the error code "tx-size."
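The weight rule is simple enough to sketch directly. The function names are mine, and I'm assuming you already have the body and witness byte counts in hand (with the witness count covering everything that only appears in the witness serialization).

```python
WITNESS_SCALE_FACTOR = 4
MAX_STANDARD_TX_WEIGHT = 400000


def tx_weight(body_bytes: int, witness_bytes: int) -> int:
    """Body bytes count four times; witness bytes count once."""
    return body_bytes * WITNESS_SCALE_FACTOR + witness_bytes


def is_weight_standard(body_bytes: int, witness_bytes: int) -> bool:
    """True if the transaction would pass the tx-size check."""
    return tx_weight(body_bytes, witness_bytes) <= MAX_STANDARD_TX_WEIGHT
```

Put differently, a transaction with no witness data can be at most 100,000 bytes and still be standard.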

Input checks

Next, we check each input's scriptSig. Each scriptSig should be no more than 1,650 bytes. Any larger, and we reject it with the error code "scriptsig-size". After that, we verify that the scriptSig only contains data pushes. Satoshi's (苔の生す迄) original design allowed executable opcodes in the scriptSig, which led to a plethora of issues (like the OP_RETURN bug, CVE-2010-5141) and malleability vectors. The strange part: allowing those non-push opcodes in the scriptSig served no apparent purpose at all. Now, if there are any non-push opcodes in the scriptSig, we reject the transaction with the error code "scriptsig-not-pushonly".

IsStandard & Solver

If the scriptSigs are good to go, we proceed to check the outputs. To do this, we use IsStandard which calls the Solver referenced earlier. Part of the script evaluation system, the Solver sorts scripts according to a set of known types. In other words, it determines whether Bitcoin Core knows what the script does. If it successfully determines a type, it returns that type and any public keys or hashes it finds in the script. I think Solver is a bit of a weird name for it; I'd probably call it Sorter or something, but hey I'm not Satoshi, so I don't get to decide these things. The Solver currently identifies seven known output types and can return two "unknown" variations.

  1. TX_SCRIPTHASH
  2. TX_WITNESS_V0_KEYHASH
  3. TX_WITNESS_V0_SCRIPTHASH
  4. TX_PUBKEYHASH
  5. TX_NULL_DATA *
  6. TX_PUBKEY
  7. TX_MULTISIG *
  8. TX_WITNESS_UNKNOWN *
  9. TX_NONSTANDARD *

We pass each output to IsStandard, which uses the Solver to sort into an output type. The IsStandard function then decides whether the Solver's returned type is standard. Immediately, IsStandard rejects TX_WITNESS_UNKNOWN and TX_NONSTANDARD. These responses indicate that the Solver couldn't identify the type of the output, and they must be some abnormal script. The rejection is bubbled up by IsStandardTx with the error code "scriptpubkey".

Types listed above without asterisks pass IsStandard and return to IsStandardTx. We have a few more sanity checks for TX_NULL_DATA and TX_MULTISIG though. As always, these checks enforce resource usage limits that protect nodes from DoS attacks.

Typically we call TX_NULL_DATA an "opreturn output." It's an output with 0 value whose scriptPubkey starts with OP_RETURN. These outputs can have extra data after the OP_RETURN. If they do, we want to put a limit on it, to avoid bloat of the blockchain. If an opreturn output has more than 83 total bytes, it gets rejected by IsStandard. Again, the rejection is bubbled up by IsStandardTx with the error code "scriptpubkey".

Like opreturn outputs, TX_MULTISIG is better known by other names: "raw multisig" or "bare multisig." Today, most multisigs are built using p2sh or p2wsh outputs, but back in the old days, we just crammed the multisig into the scriptPubkey instead of modestly hiding it in the redeemScript. For a while, it was almost impossible to make a standard bare multisig transaction. Thankfully, the current implementation allows it for up to three pubkeys. If this limit is exceeded, or the script contains nonsense like 5-of-3 or 0-of-0, then IsStandard returns false, and (of course) the rejection is bubbled up by IsStandardTx with the error code "scriptpubkey".
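Both of these extra sanity checks are tiny. Here's a sketch of each, with hypothetical function names and a constant name I made up for the 83-byte cap. The inputs are assumed to have been pulled out of the script already (the total scriptPubkey length, and the m and n of an m-of-n multisig).

```python
OPRETURN_MAX_BYTES = 83  # my own name for the 83-byte limit described above


def is_standard_opreturn(script_len: int) -> bool:
    """An opreturn output's scriptPubkey may be at most 83 bytes in total."""
    return script_len <= OPRETURN_MAX_BYTES


def is_standard_bare_multisig(m: int, n: int) -> bool:
    """Allow m-of-n bare multisig only for 1 <= m <= n <= 3."""
    if n < 1 or n > 3:
        return False
    if m < 1 or m > n:
        return False
    return True
```

So a 2-of-3 passes, while 5-of-3, 0-of-0, and anything with four or more pubkeys gets the "scriptpubkey" treatment.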

Other output checks

We're in the home stretch now; there are only three simple checks left, so let's finish strong!

  1. Node operators can disable bare multisig entirely by setting the fIsBareMultisigStd config flag. If they do, all TX_MULTISIG results will be rejected with the error code "bare-multisig."
  2. Transactions must not create "dust" outputs. A dust output contains less value than it would cost to spend. This threshold is determined by calling GetDustThreshold, using DUST_RELAY_TX_FEE, a constant set to 3000 sat/kilobyte in policy.h. The minimum number of bytes required to spend an output is predictable based on the output's type. If the output would add more bytes to the spending transaction than its value can cover in fees, we consider it dust, and we reject it with the error code "dust."
  3. Finally, a transaction can have no more than one opreturn output. If IsStandard returns TX_NULL_DATA from the Solver for two or more outputs, the transaction is rejected with the error code "multi-op-return."
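To put numbers on the dust rule, here's a sketch of the threshold calculation. The spend-size estimates (148 bytes for a legacy input, 67 for a witness input) are my understanding of what GetDustThreshold adds to the serialized output size, so treat them as assumptions. The function names and arguments are also my own.

```python
DUST_RELAY_TX_FEE = 3000  # satoshis per 1,000 bytes

# Assumed estimates of the bytes needed to spend an output later.
LEGACY_SPEND_BYTES = 148
WITNESS_SPEND_BYTES = 67


def dust_threshold(output_bytes: int, is_witness_program: bool) -> int:
    """Fee, in satoshis, to create and later spend an output at the dust rate."""
    spend = WITNESS_SPEND_BYTES if is_witness_program else LEGACY_SPEND_BYTES
    return (output_bytes + spend) * DUST_RELAY_TX_FEE // 1000


def is_dust(value: int, output_bytes: int, is_witness_program: bool) -> bool:
    """True if the output's value falls below its dust threshold."""
    return value < dust_threshold(output_bytes, is_witness_program)
```

With a 34-byte p2pkh output this works out to 546 satoshis, and with a 31-byte p2wpkh output to 294 satoshis.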

Conclusion

Whew! That's a lot! Hopefully, this gives you a good understanding of the concept of standardness in Bitcoin. We often underestimate its power, but it has a profound impact on app development. It controls what transactions regular users can submit to the network without specialized infrastructure (a direct connection to mining pools). As such, it determines entirely what software we as engineers can usefully create. Where consensus rules describe Bitcoin's functionality, standardness rules can affect its utility for any purpose, and relatively subtle changes have had large impacts in the past. Generally, the restrictions are common sense, but it is easy to see the potential for abuse of standardness rules to censor specific types of transactions. We must carefully observe governance and development processes around these rules to ensure that relay rules aren't used as a weapon or a tool of coercion.