Just as a complement to the other excellent answers by Dave Tweed, supercat and Spehro Pefhany, I'll add my 2 cents.
First, a bit of nitpicking: as I wrote in a comment, the time constant is not defined as 63%. Formally, it is defined as the reciprocal of the coefficient that multiplies time in the exponent of an exponential function. That is, if Q is the relevant quantity (voltage, current, power, whatever), and Q decays with time as:
$$Q(t) = Q_0 e^{-kt} \qquad (k > 0)$$
Then the time constant of the decaying process is defined as τ=1/k.
As others have pointed out, this means that for t=τ the quantity has decreased by about 63% (i.e. the quantity is about 37% of the starting value):
$$\frac{Q(\tau)}{Q_0} = e^{-1} \approx 0.368 = 36.8\%$$
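As a quick numerical check of that figure, here is a minimal Python sketch. The function name `decay` and the values chosen for `q0` and `k` are arbitrary assumptions for illustration; any positive values give the same ratio.

```python
import math

def decay(t, q0, k):
    """Exponential decay Q(t) = Q0 * exp(-k * t), with k > 0."""
    return q0 * math.exp(-k * t)

q0 = 5.0       # arbitrary starting value (assumed for illustration)
k = 2.0e3      # arbitrary decay coefficient in 1/s (assumed)
tau = 1.0 / k  # time constant, by definition

# Fraction of the starting value that is left at t = tau
ratio = decay(tau, q0, k) / q0
print(f"Q(tau)/Q0 = {ratio:.4f}")
```

The ratio does not depend on `q0` or `k`, which is why the "about 37% left" figure applies to every process of this form.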
What the other answers have only touched on is why that choice was made.
The answer is simplicity: the time constant gives an easy way to compare the speed of evolution of similar processes. In electronics, the time constant can often be interpreted as the "reaction speed" of a circuit. If you know the time constants of two circuits, it's easy to compare their "relative speed" by comparing those constants.
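To make the comparison concrete, here is a small Python sketch, assuming two hypothetical circuits whose transients both decay exponentially. The helper `time_to_fraction` and the two time constants are my own illustrative choices, not taken from any particular circuit.

```python
import math

def time_to_fraction(tau, fraction):
    """Time for Q(t) = Q0 * exp(-t / tau) to fall to `fraction` of Q0."""
    return tau * math.log(1.0 / fraction)

tau_a = 1e-6   # hypothetical circuit A: 1 us
tau_b = 10e-6  # hypothetical circuit B: 10 us

for fraction in (0.5, 0.1, 0.01):
    t_a = time_to_fraction(tau_a, fraction)
    t_b = time_to_fraction(tau_b, fraction)
    print(f"down to {fraction:.0%}: A takes {t_a * 1e6:.2f} us, "
          f"B takes {t_b * 1e6:.2f} us, ratio {t_b / t_a:.1f}")
```

Whatever level you pick, circuit B takes ten times as long as circuit A to reach it, which is exactly the ratio of the two time constants. That is why comparing the time constants is enough.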
Moreover, the time constant is a quantity that is easy to grasp intuitively. For example, if I say that a circuit settles with a time constant τ=1μs, then you immediately know that after a time 3τ=3μs (or maybe 5τ=5μs, depending on the accuracy you need) you can consider the transient over (3τ and 5τ are the most common rules of thumb for the conventional transient duration).
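The numbers behind those rules of thumb are easy to check. The following Python sketch, assuming a purely exponential transient with the τ=1μs of the example, prints how much of the initial transient is still left after 1, 3 and 5 time constants.

```python
import math

tau = 1e-6  # 1 us, as in the example above

for n in (1, 3, 5):
    # Fraction of the initial transient still present after n time constants
    residual = math.exp(-n)
    print(f"t = {n} tau = {n * tau * 1e6:.0f} us: {residual:.2%} remaining")
```

After 3τ about 5% of the transient is left, and after 5τ less than 1%, so which rule you pick depends on how much residual error you can tolerate.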
In other words the time constant is an easy and understandable way to convey the time scale on which a phenomenon occurs.