## Some Video Learning Suggestions

During this COVID-tide many of us have been seeking out online learning resources.  I’ve done so quite a bit in the past few months, and I thought I would do a post to recommend some of these.  They are all sites that feature videos.  I’ll start with the math ones and then go on to other subjects.

Math

• Khan Academy.  This one should be obvious to anyone who knows anything about learning math online.  The videos on this site are quite good – among the best online math videos I’ve seen.  The site also features regular self-assessments ranging from quick quizzes to end-of-course tests so that you can gauge your learning.  From what I can tell, Khan Academy is the gold standard for online learning, and I have no problems recommending it to my children (ages 9, 9, and 12) or to my college students.  (While Khan Academy got their start with math, they’ve branched out to many other subjects as well.)
• Numberphile.  This is a YouTube channel sponsored by the Mathematical Sciences Research Institute.  It features a wide variety of mathematical topics generally aimed at those who know something about math and are curious to learn more.  I’ve only watched two Numberphile videos – both on Möbius strips – but they were both quite good, and I even learned something from them!  (I also showed them to my children.)  I wouldn’t normally recommend a YouTube channel based on just two videos, but their quality plus the fact that my college students have often spoken highly of Numberphile is sufficient to make me comfortable recommending this channel.
• Prodigy.  This is an online role-playing game in which players have to solve math problems in order to defeat monsters in battle.  While I have some pedagogical criticisms of Prodigy, the game has good production values for its target audience (mostly elementary school children), it successfully ramps up the difficulty level as players demonstrate greater math knowledge, and all three of my kids have had fun with it.

Geography

• Geography Now.  Paul Barbato deserves an award for this YouTube channel, in which he covers all the countries in the world in alphabetical order.  He’s funny and informative, giving an overview of the physical geography, the people, the culture, and the political relations of each country he discusses.  The production values are much better than any of the other general-interest geography channels I’ve seen, too.  He’s only up to the Seychelles at this point, so if you’re interested in countries like Spain or the UK you’ll have to wait a bit, but this has been the go-to site for geography homeschooling with my kids the past several months.  (“Hast du gluten-free?”  “Nein!” still brings a chuckle in my household.)
• Touropia.  This is the best YouTube travel channel I’ve found.  I generally pair a country’s Touropia “Top Ten Places to Visit” video with the country’s corresponding Geography Now video for two different perspectives on the same country.

Science

• National Geographic.  As you can imagine, this venerable institution has some high-quality videos out there.  I’m recommending them under “science” because we’ve only watched their science videos.
• Mystery Science.  This is a series of hands-on video lessons for K-5.  My children’s teachers used them in the classroom pre-COVID, and I’ve used several as well for homeschooling.  The science is solid and at the appropriate level, and the hands-on activities help engage the children.  When I was using them more regularly back in the spring, some of the video lessons were available for free and others you had to purchase.  As of right now they are offering a limited number of free memberships that presumably allow you to access all the videos.

History

• Crash Course in World History.  This YouTube series features an irreverent overview of world history that should appeal to people of all ages.  I haven’t watched any of their videos all the way through, but I’ve seen segments, my wife has used them when she’s homeschooling, and my kids have enjoyed them.  This link is just to the first video in the world history series, and they have other series as well (including U.S. history).  “Except for the Mongols!” has become a catchphrase in my house.

Philosophy

• Philosophy of the Humanities.  These YouTube videos simply feature Leiden University professor Victor Gijsbers giving a series of lectures on the philosophy of science, history, knowledge, and the humanities in general.  There’s nothing fancy here, video-wise, but Victor is an amazingly clear lecturer, elucidating intricate philosophical concepts with clarity and easy-to-understand examples.  This is the only recommendation in this post that I haven’t used with my kids; this is advanced high school level material at the very least.  Favorite quote: “Hegel is evidently a comic historian.  Which, by the way, doesn’t mean we will laugh a lot when we read Hegel.  Because, believe me, we don’t.”  (I actually know Victor in another context: He and I have both written interactive fiction!)

English

• Schoolhouse Rock.  If you were an American child in the late 70s or early 80s you will know exactly what this video series is, as it was ubiquitous during prime Saturday-morning cartoon-watching.  My kids and I viewed their videos on the eight parts of speech, but they have some history videos and even a solid video on how a bill becomes law as well.  Enjoy the 70s-style animation and music, and expect to emerge with a few extra earworms.  (“Conjunction junction, what’s your function?” and half of the interjections video continue to be repeated in my house.)  The link is to the YouTube channel, which doesn’t have all the videos.  However, they are easy to find online.

French

• Get Started with French Like a Boss!.  I am not anywhere near fluent in French, but I did study it in high school and college, and I decided homeschooling would be a great opportunity to introduce my kids to a foreign language.  Lya, the host of this video that introduces basic French words and phrases, is funny and a little goofy in an endearing way.  She’s really good about using the vocabulary in different contexts, too, to the point that my 12-year-old was able to start picking up basic French sentence structure just from Lya’s examples.  This video is part of a larger series, but it’s the only one from that series I’ve seen.
• Rock ‘N Learn.  This brightly-animated series aimed at young children pairs French vocabulary with corresponding English vocabulary.  My kids are older than the target audience for the series, but the vocabulary is right at their level, and they’re old enough that they can laugh about how goofy it is.  Rock ‘N Learn actually covers a wide variety of subjects, but we’ve only watched it for French.  (Also, they appear to have a series of French language videos for teens and adults, but I haven’t watched those yet.)

## A Lesson on Converting Between Different Bases

We’re in the time of COVID-19, and that has meant taking far more direct responsibility for my children’s learning than I ever have before.  It’s been a lot of work, but it’s also been fun.  In fact, I’ve been surprised at how much I’ve enjoyed it.

One of these enjoyable aspects has been introducing my children to some mathematical concepts that are more advanced than they would normally get in third or sixth grade.  My sixth grader in particular is ready for some basic number theory, such as the representation of numbers in bases other than 10.

Here’s a problem I posed for him a few weeks ago, after making sure he understood the conversion concept.

Take the number $4217_8$ and convert it to base 2.

Dutifully, he began converting $4217_8$ to base 10.  It took him a minute or two, but he got the correct answer of $2191_{10}$.  Then he started working on the conversion from base 10 to base 2.  I told him to tell me when he finished the calculation but not tell me what the answer is.  After another couple of minutes, he did so.  I then quickly wrote down the answer of $100010001111_2$ off the top of my head.  His eyes bugged and his jaw dropped – a response that is always gratifying to see from a middle-schooler. 🙂

I didn’t keep him in suspense long, though.  Since 8 is a power of 2, there’s a fast way to convert between those two bases.  In particular, $2^3 = 8$, so you can convert the digits in the base-8 representation of a number in groups of three.  For the example of $4217_8$, we have $4_8 = 100_2$, $2_8 = 010_2$, $1_8 = 001_2$, and $7_8 = 111_2$.  (All of these base-2 representations I had in my head.)  String those four together to get

$\displaystyle 4217_8 = 100010001111_2.$

This process goes in the other direction, too.  And let’s convert from binary to base 16, just to work with a different number than 8.  Thus, for example,

$\displaystyle 110101000111_2 = \text{D47}_{16},$

as $0111_2 = 7_{16}$, $0100_2 = 4_{16}$, and $1101_2 = \text{D}_{16}$.  (Note that we have to do the conversion starting with the least significant digit; i.e., from right to left.)

This process works when converting between any two bases where one base is a positive integer power of the other.
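Here is a minimal Python sketch of the digit-grouping trick, in case you want to experiment with it.  The function names `expand` and `collapse` and the `DIGITS` table are my own choices for illustration, and the sketch assumes the larger base is at most 16.

```python
DIGITS = "0123456789ABCDEF"

def expand(number, small_base, power):
    """Convert from base small_base**power down to base small_base.

    Each digit of the input becomes a group of `power` digits."""
    groups = []
    for ch in number:
        value = DIGITS.index(ch)
        group = ""
        for _ in range(power):
            group = DIGITS[value % small_base] + group
            value //= small_base
        groups.append(group)
    return "".join(groups).lstrip("0") or "0"

def collapse(number, small_base, power):
    """Convert from base small_base up to base small_base**power.

    Digits are grouped from the right, so we pad on the left first."""
    pad = (-len(number)) % power
    number = "0" * pad + number
    out = []
    for i in range(0, len(number), power):
        value = 0
        for ch in number[i:i + power]:
            value = value * small_base + DIGITS.index(ch)
        out.append(DIGITS[value])
    return "".join(out).lstrip("0") or "0"

print(expand("4217", 2, 3))            # base 8 -> base 2
print(collapse("110101000111", 2, 4))  # base 2 -> base 16
```

Padding on the left in `collapse` is what implements the "start from the least significant digit" rule: the groups are always aligned to the right end of the number.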

## A Coin-Flipping Problem

One problem that I’ve assigned when discussing Markov chains is to calculate the expected number of flips required for a particular pattern to appear.  (Here I mean a pattern such as heads, heads, heads, or HHH.)  In this post I’m going to discuss another approach – one that doesn’t use Markov chains – to solve this problem.

Suppose we want to find the expected number of flips required for the pattern HHH to appear.  Call this X.  We can calculate X by conditioning on the various patterns we might achieve that do or don’t give us HHH.  For example, for our first few flips we could observe T, HT, HHT, or HHH.  These cover the entire outcome space and have probability 1/2, 1/4, 1/8, and 1/8, respectively.

• If we observe T, then we effectively have to start the entire process over, and we’ve used one flip to get to that point.  So, if we observe T, then the average number of flips required is $1+X$.
• If we observe HT, then we also have to start the entire process over, and we’ve used two flips to get there.  For this outcome, the average number of flips required is $2+X$.
• If we observe HHT, then we have to start the entire process over, and we’ve used three flips.  The average number of flips required for this scenario is $3+X$.
• Finally, if we observe HHH, then we have achieved our goal.  The number of flips required is 3.

All together, then, the average number of flips required satisfies the equation

$\displaystyle X = 1/2(1+X) + 1/4(2+X) + 1/8(3+X) + 1/8(3).$

Solving for X, we obtain $X = 14$.  So it takes 14 flips, on average, to obtain three consecutive heads.

What if we have a more complicated pattern, though?  Let’s look at HHT as an example.  Let X be the expected number of flips required for HHT to appear for the first time.

Once again, we can condition on the various patterns we might achieve that do or don’t give us HHT.  These are T, HT, HHH, and HHT.  As in the previous example, these cover the entire outcome space and have probability 1/2, 1/4, 1/8, and 1/8, respectively.

• If we observe T, then we have to start over.  The average number of flips required is $1+X$.
• If we observe HT, then we have to start over.  The average number of flips required is $2+X$.
• If we observe HHH, then we don’t have to start over.  It’s entirely possible that the HH at the end of HHH could be followed by a T, and then we would have achieved our pattern!  Mathematically, this means that things get a bit more complicated.
• Let $E_{HH}$ denote the expected number of flips required for HHT to appear given that we currently have HH.
• Thus if we start with HHH, then the average number of flips required to obtain HHT is $3 + E_{HH}$.
• Now we need to determine $E_{HH}$.
• If we currently have HH, then the next flip is either a T, which completes our pattern, or an H, which means we must start over in our quest to complete HHT given that we currently have HH.
• Thus $E_{HH} = 1/2(1) + 1/2(1 + E_{HH})$.
• Solving this yields $E_{HH} = 2$.
• Thus if we observe HHH, then the average number of flips required to obtain HHT is 3 + 2 = 5.
• Finally, if we observe HHT, then we have achieved our goal in 3 flips.

Therefore, $X = 1/2(1 + X) + 1/4(2 + X) + 1/8(5) + 1/8(3)$.  Solving this equation for X gives us $X = 8$ flips.

Notice that it takes fewer flips, on average, to achieve HHT than it does HHH.  This is because we don’t have to start over every time the sequence fails to match our goal sequence.

The interested reader is invited to find the expected number of flips for the other sequences.
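If you want to check your answers for the other sequences, here is a small Python sketch.  Note that it is not the conditioning argument from this post: it sets up one equation per "progress" state (how much of the pattern we currently have), much as $E_{HH}$ was used above, and solves the resulting linear system exactly.  That makes it closer in spirit to the Markov chain approach.  The function name `expected_flips` is my own, and a fair coin is assumed.

```python
from fractions import Fraction

def expected_flips(pattern):
    """Expected number of fair-coin flips until `pattern` first appears."""
    n = len(pattern)

    def step(i, flip):
        # After matching pattern[:i] and then seeing `flip`, return the
        # length of the longest prefix of the pattern we now have.
        seen = pattern[:i] + flip
        for k in range(min(n, len(seen)), -1, -1):
            if seen.endswith(pattern[:k]):
                return k

    # Unknowns E_0, ..., E_{n-1}; E_n = 0 because the pattern is complete.
    # Each equation reads E_i = 1 + (1/2) E_{step(i,H)} + (1/2) E_{step(i,T)}.
    A = [[Fraction(0)] * n for _ in range(n)]
    b = [Fraction(1)] * n
    for i in range(n):
        A[i][i] += 1
        for flip in "HT":
            j = step(i, flip)
            if j < n:
                A[i][j] -= Fraction(1, 2)

    # Gauss-Jordan elimination with exact fractions.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = 1 / A[col][col]
        A[col] = [a * inv for a in A[col]]
        b[col] *= inv
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return b[0]

print(expected_flips("HHH"), expected_flips("HHT"))
```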

## A Request for a Proof of a Binomial Identity

A few weeks ago I received an email from Professor Steve Drekic at the University of Waterloo. He asked if I knew of a way to prove the following binomial identity:

$\displaystyle \sum_{k=1}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = \binom{2n}{n}.$

(He told me later that he wanted it to help prove that the random walk on the integers with transition probability $p = 1/2$ is null recurrent.)

It’s an interesting binomial identity, in that it has a nontrivial but not overly complicated sum on the left side and a simple expression on the right. Yet I had not seen it before. I was able to find a proof that uses generating functions, and I thought I would reproduce it here.

Lemma 1.

$\displaystyle \binom{1/2}{k} = \frac{-1}{2k-1} \binom{-1/2}{k}$.

Proof.
By Identity 17 in my book The Art of Proving Binomial Identities [1],

$\displaystyle \binom{1/2}{k} = \frac{(1/2)^{\underline{k}}}{k!} = \frac{(1/2) (-1/2)^{\underline{k}}}{(-1/2-k+1) k!} = \frac{(-1/2)^{\underline{k}}}{-(2k-1) k!} = \frac{-1}{2k-1} \binom{-1/2}{k},$

where in the second step we use properties of the falling factorial $x^{\underline{k}}$.

Next, we need the following generating function.

Lemma 2.

$\displaystyle - \sqrt{1-4x} = \sum_{k=0}^{\infty} \frac{1}{2k-1} \binom{2k}{k} x^k.$

Proof.
By Newton’s binomial series (Identity 18 in my book [1]),
$\displaystyle -\sqrt{1-4x} = -(1-4x)^{1/2} = -\sum_{k=0}^{\infty} \binom{1/2}{k} (-4x)^k \\ = -\sum_{k=0}^{\infty} \frac{-1}{2k-1} \binom{-1/2}{k} (-4x)^k, \text{ by Lemma 1} \\ = \sum_{k=0}^{\infty} \frac{1}{2k-1} \left(\frac{-1}{4}\right)^k \binom{2k}{k} (-4x)^k, \text{ by Identity 30 in [1]} \\ = \sum_{k=0}^{\infty} \frac{1}{2k-1} \binom{2k}{k} x^k.$

Finally, we’re ready to prove the identity.

Identity 1.

$\displaystyle \sum_{k=1}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = \binom{2n}{n}.$

Proof.
Let $a_k = \binom{2k}{k}/(2k-1)$, and let $b_k = \binom{2k}{k}$. Since the generating function for $(a_k)$ is $-\sqrt{1-4x}$ (Lemma 2), and the generating function for $(b_k)$ is $1/\sqrt{1-4x}$ (Identity 150 in [1]), the generating function for their convolution,

$\displaystyle c_n = \sum_{k=0}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k},$

is, by the convolution property for generating functions (see Theorem 13 in [1]),

$\displaystyle - \frac{\sqrt{1-4x}}{\sqrt{1-4x}} = -1.$

Since $-1$ just generates the sequence $-1, 0, 0, 0, \ldots$, this means that

$\displaystyle \sum_{k=0}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = \begin{cases} -1, & \: n = 0; \\ 0, & \: n \geq 1.\end{cases}$

Therefore, when $n \geq 1$, we have
$\displaystyle \sum_{k=0}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = 0 \\ \implies \sum_{k=1}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} - \binom{2n}{n} = 0 \\ \implies \sum_{k=1}^n \frac{1}{2k-1} \binom{2k}{k} \binom{2n-2k}{n-k} = \binom{2n}{n}.$
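A proof is a proof, but a quick numerical sanity check never hurts.  Here is a short Python sketch that evaluates both sides of the identity exactly for small n; the function names `lhs` and `rhs` are just labels I chose for the two sides.

```python
from fractions import Fraction
from math import comb

def lhs(n):
    """Left side: sum over k = 1..n of C(2k,k) C(2n-2k,n-k) / (2k-1)."""
    return sum(
        Fraction(comb(2 * k, k) * comb(2 * n - 2 * k, n - k), 2 * k - 1)
        for k in range(1, n + 1)
    )

def rhs(n):
    """Right side: the central binomial coefficient C(2n,n)."""
    return comb(2 * n, n)

for n in range(1, 8):
    print(n, lhs(n), rhs(n))
```

Using `Fraction` keeps the arithmetic exact, so agreement here is not an artifact of floating-point rounding.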

References

1. Michael Z. Spivey, The Art of Proving Binomial Identities, CRC Press, 2019.

## Strong Induction Wasn’t Needed After All

Lately when I’ve taught the second principle of mathematical induction – also called “strong induction” – I’ve used the following example to illustrate why we need it.

Prove that you can make any amount of postage of 12 cents or greater using only 4-cent and 5-cent stamps.

At this point in a class we would have just done several examples using regular induction, and so we would be naturally inclined to try to prove this postage stamp problem using that technique.  The base case is easy: Just use three 4-cent stamps to make 12 cents of postage.  The induction hypothesis is as usual, too: Suppose, for some $k \geq 12$, that we can make k cents’ worth of postage using only 4-cent and 5-cent stamps.  But regular induction fails at the induction step: To prove that you can make $k+1$ cents using only 4-cent and 5-cent stamps, knowing that you can make k cents isn’t helpful, since you can’t add either a 4-cent or a 5-cent stamp to k cents’ worth of postage to generate $k+1$ cents’ worth of postage.  Instead, you need to know that you can make, say, $k-3$ cents’ worth of postage.  Then you can add a 4-cent stamp to that amount to produce your $k+1$ cents.  In other words, you need to assume in the induction hypothesis not that you can make k cents but that you can make $12, 13, \ldots, k$ cents.  And that’s the essence of strong induction.

However, before we get to this realization my students will often suggest that we can produce $k+1$ cents from k cents by simply removing one of the 4-cent stamps used to produce k cents and replacing it with a 5-cent stamp.  This is a good idea.  However, it assumes that we actually used at least one 4-cent stamp to produce k cents, and that’s a faulty assumption.  Sometimes we don’t need a 4-cent stamp, such as if we use three 5-cent stamps to produce 15 cents.  (In fact, for 15 cents we must use only 5-cent stamps.)  If we don’t use any 4-cent stamps then we can’t generate $k+1$ cents from k cents by replacing a 4-cent stamp with a 5-cent one.

Normally the students are a little chagrined that this idea fails.  And that happened again this term.  After discussing the idea and why it doesn’t quite work I was about ready to move on and introduce strong induction when one of my students interjected, “I’ve got it!  I’ve got it!”  He pointed out that if we’re going to use only 5-cent stamps to generate k cents, when $k \geq 12$, then we’ll need at least three of them.  That’s true, I said.  Then, he added, if we have at least three 5-cent stamps to make k cents, we can replace them with four 4-cent stamps to yield $k+1$ cents’ worth of postage.  That covers the remaining case.  I was impressed enough that I applauded the class for their collective work in solving the problem.

To recap: You don’t actually need strong induction to solve the stamp problem described above.  Use regular induction, and break the induction step into two cases: (1) If you use at least one 4-cent stamp to make k cents, replace it with a 5-cent stamp to make $k+1$ cents.  (2) If you only use 5-cent stamps to make k cents, you’ll need at least three of them.  Replace those three with four 4-cent stamps to generate $k+1$ cents’ worth of postage.
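The two-case induction step is really an algorithm, so it can be written as a loop.  The following Python sketch starts from the base case (three 4-cent stamps for 12 cents) and applies the induction step repeatedly; the function name `stamps` is mine.

```python
def stamps(k):
    """Return (fours, fives) with 4*fours + 5*fives == k, for k >= 12."""
    if k < 12:
        raise ValueError("the argument only covers postage of 12 cents or more")
    fours, fives = 3, 0          # base case: 12 = 4 + 4 + 4
    for _ in range(12, k):       # one induction step per extra cent
        if fours >= 1:
            # Case 1: swap a 4-cent stamp for a 5-cent stamp.
            fours -= 1
            fives += 1
        else:
            # Case 2: only 5-cent stamps, so there are at least three.
            # Swap three of them for four 4-cent stamps.
            fives -= 3
            fours += 4
    return fours, fives

for k in (12, 15, 16, 23):
    print(k, stamps(k))
```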

Well done, Spring 2020 Math 210 students!

## Arguments for 0.9999… Being Equal to 1

Recently I tried to explain to my 11-year-old son why 0.9999… equals 1.  The standard arguments for $0.9999... = 1$ (at least the ones I’ve seen) assume more math background than he has.  So I tried another couple of arguments, and they seemed to convince him.

The first argument.  The usual claim you get from people who aren’t yet convinced that $0.9999... = 1$ is that 0.9999… is the real number just before 1 on the number line.  Let’s suppose this is true.  What is the average of 0.9999… and 1?  Stop and think about that for a bit.

If the average exists, it must be larger than 0.9999…, and it must be smaller than 1.  But if 0.9999… is just before 1 on the number line, there can’t be such a number.  So either (1) the averaging operation doesn’t apply to 0.9999… and 1, (2) there are numbers between 0.9999… and 1 on the number line, or (3) 0.9999… equals 1.  But (1) immediately leads to the question “Why not?”, which has no obvious answer, and (2) leads to the question of what their decimal representations would be, which also has no obvious answer.  Explanation (3), that 0.9999… equals 1, starts to look more plausible.

The second argument.  Again, let’s assume that 0.9999… is the real number just before 1 on the number line.  If this is true, then what is the difference of 1 and 0.9999…?  Again, stop and think about that for a bit.

If it’s not zero, then you could just add half of that difference to 0.9999… to get a new number between 0.9999… and 1, which not only contradicts our assumption but also forces us to come up with the decimal representation of such a number.  If it is 0, then you have that $0.9999... = 1$.  And if you try to argue that you can’t subtract 0.9999… from 1, then you need to explain why that operation is not allowed for those two real numbers.  (This second argument is a lot like the first one, really.)  The most reasonable of the three options is that the difference is 0, which means that 0.9999… is actually equal to 1.
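One way to make the second argument concrete: suppose someone proposes a positive value for the difference.  The Python sketch below, with helper names of my own choosing, finds a finite string of nines that is already closer to 1 than that proposed difference.  Since 0.9999… is at least as large as every such truncation, the proposed difference is too big, whatever positive value it is.

```python
from fractions import Fraction

def truncation(n):
    """The number 0.99...9 with n nines, as an exact fraction."""
    return Fraction(10**n - 1, 10**n)

def nines_needed(gap):
    """Smallest n for which 1 - truncation(n) is less than gap (gap > 0)."""
    n = 1
    while 1 - truncation(n) >= gap:
        n += 1
    return n

print(1 - truncation(5))                # the gap after five nines
print(nines_needed(Fraction(1, 1000)))  # nines needed to beat 1/1000
```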

Final comments and the standard algebra argument.  Both arguments are reductio ad absurdum arguments; that is, they assume that 0.9999… is not equal to 1 and then reason to a contradiction.  The other arguments that I’ve seen are all direct arguments; i.e., they reason from basic mathematical principles to the conclusion that 0.9999… equals 1.

For example, here’s the standard argument via algebra.  We know that 0.9999… must be equal to some number, so let’s call that number x.  Multiplying by 10 yields $10x = 9.9999...$.  Subtracting the first equation from the second leaves us $9x = 9.0000...$, which implies that $x = 1$.  Thus $0.9999... = 1$.

The algebraic argument is a great one, provided you know algebra.  But for my preteen with a pre-algebra background, these two reductio ad absurdum arguments seemed to be enough to convince him.

## Diversity Statements in Hiring

Recently Abigail Thompson, chair of the mathematics department at the University of California, Davis, and a vice president of the American Mathematical Society, published this article in Notices of the American Mathematical Society.  The article includes the following statement:

Faculty at universities across the country are facing an echo of the loyalty oath [of the 1950s], a mandatory “Diversity Statement” for job applicants.  The professed purpose is to identify candidates who have the skills and experience to advance institutional diversity and equity goals.  In reality it’s a political test, and it’s a political test with teeth.

She goes on to explain why.

Why is it a political test? Politics are a reflection of how you believe society should be organized. Classical liberals aspire to treat every person as a unique individual, not as a representative of their gender or their ethnic group. The sample rubric dictates that in order to get a high diversity score, a candidate must have actively engaged in promoting different identity groups as part of their professional life. The candidate should demonstrate “clear knowledge of, experience with, and interest in dimensions of diversity that result from different identities” and describe “multiple activities in depth.” Requiring candidates to believe that people should be treated differently according to their identity is indeed a political test.

I agree.  The use of diversity statements in hiring is a political test and thus is inherently discriminatory.  We should abandon the practice.


## Finding the Area of an Irregular Polygon

Finding the area of an irregular polygon via geometry can be a bit of a chore, as the process depends heavily on the shape of the polygon.  It turns out, however, that there’s a formula that can give you the area of any polygon as long as you know the polygon’s vertices.  That formula is based on Green’s Theorem.

Green’s Theorem is a powerful tool in vector analysis that shows how to convert from a line integral around a closed curve to a double integral over the region enclosed by the curve and vice versa.  More specifically, the circulation-curl form of Green’s Theorem states that, if D is a closed region with boundary curve C and C is oriented counterclockwise, then

$\displaystyle \oint_C (M dx + N dy) = \iint_D \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dA$.

The special case of Green’s Theorem that can generate our formula for the area of a polygon has $N = x$ and $M = 0$.  This yields

$\displaystyle \oint_C x dy = \iint_D dA$.

Since the double integral here gives the area of region D, this equation says that we can find the area of D by evaluating the line integral of $x dy$ counterclockwise around the curve C.  The boundary curve for a polygon consists of a finite set of line segments, though, and so to obtain our formula we just need to find out what $\int_L x dy$ is for a line segment L that runs from a generic point $(x_1, y_1)$ to another generic point $(x_2, y_2)$.  Let’s do that now.

We need a parameterization of L.  That requires a point on the line segment L and a vector in the direction of L.  The starting point $(x_1, y_1)$ and the vector $(x_2 - x_1){\bf i} + (y_2 - y_1){\bf j}$ from the starting point to the ending point work well, giving us the parameterization

$\displaystyle {\bf r}(t) = (x_1 + t(x_2-x_1)){\bf i} + (y_1 + t(y_2-y_1)){\bf j}, \: 0 \leq t \leq 1.$

With $x = x_1 + t(x_2-x_1)$ and $y = y_1 + t(y_2-y_1)$, we have $dy = (y_2 - y_1)dt$.  This gives us the integral

$\displaystyle \int_0^1 (x_1 + t(x_2-x_1))(y_2-y_1)dt \\ = (y_2-y_1)\int_0^1 (x_1 + t(x_2-x_1))dt \\ = (y_2-y_1)\left[tx_1 + \frac{t^2}{2}(x_2-x_1)\right]_0^1 \\ = (y_2-y_1)\frac{x_1+x_2}{2}.$

In other words, the line integral of $x dy$ from $(x_1, y_1)$ to  $(x_2, y_2)$ is $\displaystyle (y_2-y_1)\frac{x_1+x_2}{2},$ the difference in the y coordinates times the average of the x coordinates.

Let’s put all this together to get our formula.  Suppose we have a polygon D with n vertices. Start with any vertex.  Label it $(x_1, y_1)$.  Then move counterclockwise around the polygon D, labeling successive vertices $(x_2, y_2), (x_3, y_3)$, and so forth, until you label the last vertex $(x_n, y_n)$.  Finally, give the starting vertex a second label of $(x_{n+1}, y_{n+1})$.  Then the area of D is given by

$\displaystyle \text{Area of } D = \sum_{i=1}^n (y_{i+1} - y_i) \frac{x_i + x_{i+1}}{2}.$

In words, this means that to find the area of polygon D you can just take the difference of the y coordinates times the average of the x coordinates of the endpoints of each line segment making up the polygon and then add up.

This formula isn’t the only such formula for finding the area of a polygon.  In fact, a more common formula that can also be proved with Green’s Theorem is

$\displaystyle \text{Area of } D= \frac{1}{2} \sum_{i=1}^n (x_i y_{i+1} - x_{i+1}y_i)$.

This second formula looks a little nicer to the eye, but I prefer the first formula for calculations by hand.
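Both formulas are easy to code up.  Here is a Python sketch; the function names are my own, and the vertices are assumed to be listed counterclockwise (listing them clockwise flips the sign of the result).

```python
def area_by_trapezoids(vertices):
    """First formula: sum of (y_{i+1} - y_i) * (x_i + x_{i+1}) / 2."""
    n = len(vertices)
    total = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # wrap around to the first vertex
        total += (y2 - y1) * (x1 + x2) / 2
    return total

def area_by_shoelace(vertices):
    """Second formula: half the sum of x_i * y_{i+1} - x_{i+1} * y_i."""
    n = len(vertices)
    total = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2

# A 4-by-3 rectangle with a triangular "roof" of height 2 on top.
house = [(0, 0), (4, 0), (4, 3), (2, 5), (0, 3)]
print(area_by_trapezoids(house), area_by_shoelace(house))
```

The `(i + 1) % n` index plays the role of the second label $(x_{n+1}, y_{n+1})$ given to the starting vertex.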


## An observation on the unit circle

We did a quick review of the unit circle in my multivariate calculus class last week, and I pointed out a fact about the sines and cosines of the common angles in the first quadrant that some of the students appeared not to have seen before.  I thought I would record it here.

The five angles most commonly encountered in the first quadrant, together with their coordinates on the unit circle, are as follows (angle measures are in radians):

• $0: (1,0)$
• $\pi/6: (\sqrt{3}/2, 1/2)$
• $\pi/4: (\sqrt{2}/2, \sqrt{2}/2)$
• $\pi/3: (1/2, \sqrt{3}/2)$
• $\pi/2: (0, 1)$

Part of the reason these coordinates are important is that they tell you the sine and cosine of the corresponding angle.  The cosine value is the first coordinate (technical term: abscissa), while the sine value is the second coordinate (technical term: ordinate).  For example, $\cos (\pi/3) = 1/2$ and $\sin(\pi/3) = \sqrt{3}/2$.

Here’s another way of writing the same information, one that illustrates a pattern with the coordinate values.

• $0: (\sqrt{4}/2, \sqrt{0}/2)$
• $\pi/6: (\sqrt{3}/2, \sqrt{1}/2)$
• $\pi/4: (\sqrt{2}/2, \sqrt{2}/2)$
• $\pi/3: (\sqrt{1}/2, \sqrt{3}/2)$
• $\pi/2: (\sqrt{0}/2, \sqrt{4}/2)$

Every coordinate value is of the form $\sqrt{i}/2$, where i is an integer from 0 to 4.  As the angle measure increases from 0 radians to $\pi/2$ radians, the cosine value decreases from $\sqrt{4}/2$ down to $\sqrt{0}/2$, while the sine value increases from $\sqrt{0}/2$ up to $\sqrt{4}/2$.  Perhaps this pattern will help some folks to memorize the coordinate values more easily.
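For anyone who would like to see the pattern confirmed numerically, here is a tiny Python sketch.  The names `angles`, `pattern_point`, and `matches_pattern` are mine, and the comparison uses a small tolerance because the values are floating-point.

```python
import math

# The five common first-quadrant angles, in increasing order.
angles = [0, math.pi / 6, math.pi / 4, math.pi / 3, math.pi / 2]

def pattern_point(i):
    """The i-th point according to the pattern: (sqrt(4-i)/2, sqrt(i)/2)."""
    return (math.sqrt(4 - i) / 2, math.sqrt(i) / 2)

def matches_pattern(tol=1e-12):
    """Check that (cos t, sin t) agrees with the pattern at each angle."""
    return all(
        math.isclose(math.cos(t), pattern_point(i)[0], abs_tol=tol)
        and math.isclose(math.sin(t), pattern_point(i)[1], abs_tol=tol)
        for i, t in enumerate(angles)
    )

print(matches_pattern())
```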

## Six Proofs of a Binomial Identity

I’m a big fan of proving an identity in multiple ways, as I think each perspective gives additional insight into why the identity is true.  In this post we’ll work through six different proofs of the binomial identity $\displaystyle \sum_{k=0}^n \binom{n}{k} k = n 2^{n-1}$.

1. The absorption identity proof.

The absorption identity states that, for real n, $k \binom{n}{k} = n \binom{n-1}{k-1}$.  Thus we have

$\displaystyle \sum_{k=0}^n \binom{n}{k} k = \sum_{k=0}^n \binom{n-1}{k-1} n = n \sum_{k=-1}^{n-1} \binom{n-1}{k} = n \sum_{k=0}^{n-1} \binom{n-1}{k} = n 2^{n-1}$, recalling that $\binom{n-1}{-1} = 0$.

2. The combinatorial proof.

How many chaired committees of any size can be formed from a group of n people?

One way to count them is to choose the committee first and then choose the chair.  First, we condition on the size of the committee.  If there are k people on the committee, there are $\binom{n}{k}$ ways to choose the people to be on the committee.  Then there are k ways to choose the chair.  Summing up over all possible values of k, we find that the answer is $\sum_{k=0}^n \binom{n}{k} k$.

Another way is to choose the chair first and then choose the committee.  There are n choices for the chair.  Then, for each of the remaining $n-1$ people, we have two options: Each person could be on the committee or not.  Thus there are $2^{n-1}$ ways to choose the committee once the chair is chosen.  This gives an answer of $n 2^{n-1}$ when choosing the chair first.

Since the two answers must be equal, we have $\displaystyle \sum_{k=0}^n \binom{n}{k} k = n 2^{n-1}$.
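For small n you can check the count by brute force.  This Python sketch, with a function name of my own, lists every committee and every possible chair and simply counts the pairs.

```python
from itertools import combinations

def chaired_committees(n):
    """Count (committee, chair) pairs drawn from n people by enumeration."""
    people = range(n)
    count = 0
    for size in range(n + 1):
        for committee in combinations(people, size):
            for chair in committee:
                count += 1
    return count

for n in range(1, 7):
    print(n, chaired_committees(n), n * 2 ** (n - 1))
```

The empty committee contributes nothing, since it has no one to serve as chair; that matches the $k = 0$ term of the sum being zero.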

3. The calculus proof.

Differentiate the binomial theorem, $\displaystyle \sum_{k=0}^n \binom{n}{k} x^k = (x+1)^n$, with respect to x to obtain $\displaystyle \sum_{k=0}^n \binom{n}{k} k x^{k-1} = n(x+1)^{n-1}$.  Letting $x = 1$, we have $\displaystyle \sum_{k=0}^n \binom{n}{k} k = n 2^{n-1}$.

4. The probabilistic proof.

Imagine an experiment where you flip a fair coin n times.  What is the expected number of heads for this experiment?

One way to determine this starts by conditioning on the number k of heads.  If there are k heads, there are $\binom{n}{k}$ ways of choosing which k flips will be the heads.  The probability that these flips are all heads is $(1/2)^k$, and the probability that the remaining flips are all tails is $(1/2)^{n-k}$.  Multiply these together and apply the definition of expected value to get an answer of $\displaystyle \sum_{k=0}^n \binom{n}{k} k \left(\frac{1}{2} \right)^k \left(\frac{1}{2} \right)^{n-k} = \sum_{k=0}^n \binom{n}{k} k \left( \frac{1}{2} \right)^n = \frac{1}{2^n} \sum_{k=0}^n \binom{n}{k} k$.

Another way is to use indicator variables.  Let $X_k$ be $1$ if flip k is heads and $0$ if flip k is tails.  Then the number of heads in the sequence of n flips is $\sum_{k=1}^n X_k$.  Also, the expected value of $X_k$ is, by definition, $E(X_k) = 1 (1/2) + 0 (1/2) = 1/2$.  Thus the expected number of heads is $E \left( \sum_{k=1}^n X_k \right) = \sum_{k=1}^n E(X_k) = \sum_{k=1}^n (1/2) =n/2$.  (This is a formal way of arguing for the answer of $n/2$ that our intuition says should be the expected number of heads.)

Equating our two answers, we have $\displaystyle \frac{1}{2^n} \sum_{k=0}^n \binom{n}{k} k = \frac{n}{2}$, which implies $\displaystyle \sum_{k=0}^n \binom{n}{k} k = n 2^{n-1}$.

5. The exponential generating functions proof.

For this proof we’re going to need a definition and a few properties of exponential generating functions.

First, the binomial convolution of the sequences $(a_n)$ and $(b_n)$ is given by $\displaystyle \sum_{k=0}^n \binom{n}{k} a_k b_{n-k}$.

Second, we have the following.  (See, for example, pages 126-128 of my book The Art of Proving Binomial Identities. [1])

1. If $\displaystyle f_a(x) = \sum_{n=0}^{\infty} a_n \frac{x^n}{n!}$ and $\displaystyle f_b(x) = \sum_{n=0}^{\infty} b_n \frac{x^n}{n!}$ then $\displaystyle f_a(x) f_b(x) = \sum_{n=0}^{\infty} \left(\sum_{k=0}^n \binom{n}{k} a_k b_{n-k} \right) \frac{x^n}{n!}$.  (This is the binomial convolution property for exponential generating functions.)
2. We have $\displaystyle e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$.  (This is the Maclaurin series for $e^x$.)
3. If $\displaystyle f_a(x) = \sum_{n=0}^{\infty} a_n \frac{x^n}{n!}$ then $x f_a(x)$ is the exponential generating function for the sequence given by $(n a_{n-1}) = (0, a_0, 2a_1, 3a_2, \ldots)$.

By the definition of binomial convolution, $\sum_{k=0}^n \binom{n}{k} k$ is the binomial convolution of the sequences with $a_n = n$ and $b_n = 1$.  What are the exponential generating functions of these two sequences?

By Property 2, the exponential generating function for the sequence $(b_n)$, with $b_n = 1$, is $e^x$.

If we take the sequence $1, 1, 1, \ldots$, append a $0$ to the front, and multiply it by n, we have the sequence with $a_n = n$.  By Property 3, then, the exponential generating function for the sequence $(a_n)$ is $x e^x$.

Thus, by Property 1, the exponential generating function for $\sum_{k=0}^n \binom{n}{k} k$ is $x e^x (e^x) = x e^{2x}$.  However, $\displaystyle e^{2x} = \sum_{k=0}^{\infty} \frac{(2x)^k}{k!} = \sum_{k=0}^{\infty} \frac{2^k x^k}{k!}$, which means that $e^{2x}$ is the exponential generating function for the sequence $(2^n)$.  Thus, by Property 3, $x e^{2x}$ is also the exponential generating function for the sequence $(n 2^{n-1})$.   Since $\left(\sum_{k=0}^n \binom{n}{k} k \right)$ and $\left(n 2^{n-1}\right)$ have the same generating function, they must be equal, and so $\displaystyle \sum_{k=0}^n \binom{n}{k} k = n 2^{n-1}$.

6. The finite differences bootstrapping proof.

This proof requires a result that I don’t think is that well-known.  It’s Theorem 4 in my paper “Combinatorial Sums and Finite Differences.” [2]

Theorem: Let $(a_k)$ and $(b_k)$ be sequences such that $(a_k)$ is the finite difference sequence for $(b_k)$; i.e., $\Delta b_k = b_{k+1} - b_k = a_k$ for each $k \geq 0$.  If $g_n = \sum_{k=0}^n \binom{n}{k} a_k$ and $h_n = \sum_{k=0}^n \binom{n}{k} b_k$ then, for $n \geq 0$, $\displaystyle h_n = 2^n \left( b_0 + \sum_{k=1}^n \frac{g_{k-1}}{2^k} \right).$

First, if $b_k = 1$, then $a_k = 0$.  Thus $g_n = 0$, and, by the theorem, $h_n = \sum_{k=0}^n \binom{n}{k} = 2^n$.

Next, if $b_k = k$, then $a_k = k+1 - k = 1$.  By the theorem and the previous result, then, $\displaystyle \sum_{k=0}^n \binom{n}{k} k = 2^n \left(0 + \sum_{k=1}^n \frac{2^{k-1}}{2^k} \right) = 2^n \sum_{k=1}^n \frac{1}{2} = 2^n \frac{n}{2} = n 2^{n-1}$.
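Since the theorem may be unfamiliar, here is a Python sketch that tests it numerically.  It takes $g_n$ and $h_n$ to be the binomial sums $\sum_k \binom{n}{k} a_k$ and $\sum_k \binom{n}{k} b_k$, as in the two applications above.  The function names are my own.

```python
from fractions import Fraction
from math import comb

def binomial_sum(seq, n):
    """Compute the sum over k = 0..n of C(n,k) * seq(k)."""
    return sum(comb(n, k) * seq(k) for k in range(n + 1))

def h_via_theorem(b, n):
    """Evaluate h_n from the theorem's right-hand side."""
    def a(k):
        # Finite difference of b.
        return b(k + 1) - b(k)
    tail = sum(
        Fraction(binomial_sum(a, k - 1), 2**k)   # g_{k-1} / 2^k
        for k in range(1, n + 1)
    )
    return 2**n * (b(0) + tail)

def identity(k):
    return k

for n in range(1, 7):
    print(n, binomial_sum(identity, n), h_via_theorem(identity, n), n * 2 ** (n - 1))
```

The same check works for other choices of $b_k$, such as $b_k = k^2$, which is how the theorem can be bootstrapped to higher powers.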

References

1. Spivey, Michael Z.  The Art of Proving Binomial Identities.  CRC Press, 2019.
2. Spivey, Michael Z.  “Combinatorial Sums and Finite Differences.”  Discrete Mathematics, 307 (24): 3130-3146, 2007.