Whether accuracy is a good metric depends on the needs of the application. For instance, here is an example by @benoit_sanchez:
You own an egg shop and each egg you sell generates a net revenue of 2
dollars. Each customer who enters the shop may either buy an egg or
leave without buying any. For some customers you can decide to offer a
discount; you will then only get 1 dollar of revenue, but the customer
will always buy.
You plug in a webcam that analyses each customer's behaviour using
features such as "sniffs the eggs", "holds a book of omelette
recipes"... and classifies them as either "wants to buy at 2 dollars"
(positive) or "wants to buy only at 1 dollar" (negative) before they
leave.
If your classifier makes no mistake, then you get the maximum revenue
you can expect. If it is not perfect, then every false positive costs
you 1 dollar (the customer leaves without the discounted sale you could
have made) and every false negative costs you 1 dollar (you give away a
discount that was not needed). The accuracy of your classifier is then
exactly how close you are to the maximum revenue. It is the perfect
measure.
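To make the arithmetic concrete, here is a minimal sketch (the customers and the imperfect classifier are hypothetical, randomly generated) showing that each misclassification costs exactly 1 dollar, so the revenue collected is an affine function of accuracy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical ground truth: 1 = "wants to buy at 2 dollars", 0 = "wants to buy only at 1 dollar"
y_true = rng.integers(0, 2, size=n)
# A hypothetical, imperfect classifier that agrees with the truth about 80% of the time
y_pred = np.where(rng.random(n) < 0.8, y_true, 1 - y_true)

# Revenue per customer under the classifier's decisions:
#   predicted positive & truly positive -> 2 (full-price sale)
#   predicted positive & truly negative -> 0 (customer leaves, 1 dollar lost)
#   predicted negative                  -> 1 (discount given, customer always buys;
#                                             1 dollar lost if they would have paid 2)
revenue = np.where(y_pred == 1, np.where(y_true == 1, 2, 0), 1).sum()
max_revenue = np.where(y_true == 1, 2, 1).sum()

errors = (y_pred != y_true).sum()
accuracy = 1 - errors / n

# Every error (false positive or false negative) costs exactly 1 dollar,
# so revenue = max_revenue - errors, i.e. revenue is an affine function of accuracy.
print(accuracy, revenue, max_revenue - errors)
```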
See also my answer to a related question on the stats Stack Exchange.
I disagree with the other answers about accuracy and class imbalance. If imbalance is a problem, just look at the improvement in accuracy over always guessing the most common class; it really isn't a big deal. Something like:
$$\Delta_\mathrm{acc} = \frac{\mathrm{acc} - \pi}{1 - \pi}$$
where $\pi$ is the proportion of the data belonging to the majority class. This is the proportion of the "residual" accuracy captured by the model over and above what can be obtained from the class labels alone. Note that this is an affine transformation of accuracy, so it still measures the same thing, just in a slightly more interpretable manner. In @Dave's example, $\Delta_\mathrm{acc}$ would be negative, which would be a clear indication that the model was worse than useless.
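As a sketch of how this might be computed (the helper name `delta_acc` and the toy labels below are just for illustration):

```python
import numpy as np

def delta_acc(y_true, y_pred):
    """Improvement in accuracy over always guessing the majority class:
    (acc - pi) / (1 - pi), where pi is the majority-class proportion.
    Negative values mean the model is worse than the trivial classifier."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    acc = np.mean(y_true == y_pred)
    pi = max(np.mean(y_true == c) for c in np.unique(y_true))
    return (acc - pi) / (1 - pi)

# Hypothetical imbalanced data: 90% negatives, 10% positives
y_true = np.array([0] * 90 + [1] * 10)

print(delta_acc(y_true, np.zeros_like(y_true)))   # 0.0  -- no better than guessing
y_bad = y_true.copy()
y_bad[:20] = 1                                    # flips 20 true negatives to positive
print(delta_acc(y_true, y_bad))                   # -1.0 -- worse than useless
```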
If different misclassifications have different costs, then you can use the expected loss i.e. a weighted accuracy, where the errors are weighted by their appropriate costs. Investigate "cost-sensitive learning" or "Bayesian decision theory" for more information.
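A minimal sketch of such a weighted accuracy, assuming a hypothetical cost matrix in which a false negative is five times as costly as a false positive:

```python
import numpy as np

# Hypothetical cost matrix: rows index the true class, columns the predicted class.
# A false negative (true 1 predicted 0) is assumed five times as costly as a
# false positive (true 0 predicted 1); correct decisions cost nothing.
COSTS = np.array([[0.0, 1.0],
                  [5.0, 0.0]])

def expected_loss(y_true, y_pred, costs=COSTS):
    """Mean misclassification cost, i.e. accuracy weighted by the cost of each error."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return costs[y_true, y_pred].mean()

y_true = np.array([0, 0, 1, 1, 1])
y_pred = np.array([0, 1, 1, 0, 1])
print(expected_loss(y_true, y_pred))   # (0 + 1 + 0 + 5 + 0) / 5 = 1.2
```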
Basically, to decide which metric to use, you need to be clear about exactly what is important for your application, and choose the metric (or metrics) accordingly. For "normal" applications, I tend to use accuracy (or expected loss in a cost-sensitive setting) to directly measure the quality of the decisions, the area under the ROC curve (which measures the quality of the ranking of patterns), and cross-entropy/predictive information or the Brier score as measures of the general calibration of the probability estimates. The mistake is to think there is a "one-size-fits-all" solution to the problem; it is best to think hard about which metrics are appropriate for each application and why.
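For example, with scikit-learn those metrics might be computed along the following lines (the labels and predicted probabilities here are made up purely for illustration):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, roc_auc_score,
                             log_loss, brier_score_loss)

# Made-up true labels and predicted probabilities of the positive class
y_true = np.array([0, 0, 0, 1, 1, 1, 0, 1])
p_pos  = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.5, 0.3])
y_pred = (p_pos >= 0.5).astype(int)    # decisions at the usual 0.5 threshold

print("accuracy     :", accuracy_score(y_true, y_pred))    # quality of the decisions
print("ROC AUC      :", roc_auc_score(y_true, p_pos))      # quality of the ranking
print("cross-entropy:", log_loss(y_true, p_pos))           # calibration of the probabilities
print("Brier score  :", brier_score_loss(y_true, p_pos))   # calibration of the probabilities
```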