Unless you are a Bayesian statistician, things in statistics will always be a little goofy. The Bayesians have extended probability to a coherent theory of stats. There is currently no competing theory of stats that is equally coherent (in the philosopher’s sense). Other than that, the unifying idea is often the “likelihood function.” Unfortunately, neither Bayesian methods nor likelihood are commonly taught at the introductory level.
But basic stats is not so goofy that a PhD statistician can’t explain why something works (at least for low-level versions of “why”). To tie a lot of common methods together (regression, ANOVA, t-tests, ANCOVA, etc.) you just need to work the linear algebra muscles. :-)
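To make that a bit more concrete, here is a small sketch (mine, not from any textbook) of one such tie: the pooled two-sample t-test gives the same t statistic as the slope in a least-squares regression of the response on a 0/1 group indicator. The data values and the function names `pooled_t` and `regression_slope_t` are made up for illustration, and it assumes the equal-variance version of the t-test.

```python
import math

def pooled_t(a, b):
    """Classic two-sample t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((v - ma) ** 2 for v in a)   # within-group sum of squares, group a
    ssb = sum((v - mb) ** 2 for v in b)   # within-group sum of squares, group b
    sp2 = (ssa + ssb) / (na + nb - 2)     # pooled variance estimate
    return (mb - ma) / math.sqrt(sp2 * (1 / na + 1 / nb))

def regression_slope_t(x, y):
    """t statistic for the slope b1 in the least-squares fit y = b0 + b1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)                    # residual variance estimate
    return b1 / math.sqrt(s2 / sxx)

# Made-up measurements for two groups (illustrative only).
group_a = [4.1, 5.0, 4.7, 5.3, 4.4]
group_b = [5.9, 6.4, 5.1, 6.8, 6.0, 5.6]

# Encode group membership as a 0/1 indicator and stack the responses.
x = [0.0] * len(group_a) + [1.0] * len(group_b)
y = group_a + group_b

t_classic = pooled_t(group_a, group_b)
t_regress = regression_slope_t(x, y)
print(t_classic, t_regress)  # the two values agree up to rounding error
```

The reason they agree is that, with a 0/1 indicator, the fitted slope is just the difference in group means and the residual sum of squares is the pooled within-group sum of squares. ANOVA and ANCOVA extend the same idea with more indicator columns and covariates.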
There is one book, by David J. Saville and Graham R. Wood. The first version was called Statistical Methods: The Geometric Approach (Springer-Verlag, 1991). I am pretty sure that it has been revised, and I also think that they published another similar book.
It claims that it is introductory, but I don’t know. If you have survived a first class in stats (even a bad one) and have some linear algebra, and really want to learn some basic stats, I’d recommend you try to find a copy in a library. (I wouldn’t recommend buying it until you see it—math books are very odd: what one person loves the next person hates.)
But it does cover linear algebra (what the authors call “geometry”) and elementary stats, focusing on regression and ANOVA methods. So it might be the ‘Statistics for people who actually want to understand it and already have a basis in linear algebra’ book you describe. Or at least a good iteration.
By the way, there is a lot more to probability than what they can teach in discrete math. :-) Probability can take you from the basement all the way up to the top of hard mathematics!