On Constructing Functions, Part 5

Example 5

A sequence of functions $\{f_n : \mathbb{R} \to \mathbb{R}\}$ which converges to 0 pointwise but does not converge to 0 in $L^1$.
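The definition of the sequence appeared as an image in the original post; a definition consistent with the computation at the end of this example is the indicator function of a sliding interval:
$$f_n = \chi_{(n,\,n+1)}, \qquad\text{i.e.}\qquad f_n(x) = \begin{cases} 1 & \text{if } x \in (n, n+1),\\ 0 & \text{otherwise.} \end{cases}$$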

This works because: The sequence tends to 0 pointwise since for a fixed $x \in \mathbb{R}$, you can always find $N \in \mathbb{N}$ so that $f_n(x) = 0$ for all $n$ bigger than $N$. (Just choose $N > x$!)

The details: Let $x \in \mathbb{R}$ and fix $\epsilon > 0$ and choose $N \in \mathbb{N}$ so that $N > x$. Then whenever $n > N$, we have $|f_n(x) - 0| = 0 < \epsilon$.

Of course, $f_n \not\to 0$ in $L^1$ since
$$\int_{\mathbb{R}} |f_n| = \int_{(n,n+1)} f_n = 1 \cdot \lambda\big((n, n+1)\big) = 1.$$

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to Tai-Danae Bradley*


On Constructing Functions, Part 4

This post is the fourth example in an ongoing list of various sequences of functions which converge to different things in different ways.

Also in this series:

Example 1: converges almost everywhere but not in $L^1$
Example 2: converges uniformly but not in $L^1$
Example 3: converges in $L^1$ but not uniformly
Example 5: converges pointwise but not in $L^1$
Example 6: converges in $L^1$ but does not converge anywhere

Example 4

A sequence of (Lebesgue) integrable functions $f_n : \mathbb{R} \to [0,\infty)$ so that $\{f_n\}$ converges to $f : \mathbb{R} \to [0,\infty)$ uniformly, yet $f$ is not (Lebesgue) integrable.

Our first observation is that "$f$ is not (Lebesgue) integrable" can mean one of two things: either $f$ is not measurable or $\int f = \infty$. The latter tends to be easier to think about, so we'll do just that. Now what function do you know of such that when you "sum it up" you get infinity? How about something that behaves like the divergent harmonic series? Say, its continuous cousin $f(x) = \frac{1}{x}$? That should work since we know
$$\int_{\mathbb{R}} \frac{1}{x}\,\chi_{[1,\infty)} = \int_1^{\infty} \frac{1}{x}\,dx = \infty.$$
Now we need to construct a sequence of integrable functions $\{f_n\}$ whose uniform limit is $\frac{1}{x}$. Let's think simple: think of drawing the graph of $f(x)$ one "integral piece" at a time. In other words, define:
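The displayed definition was an image in the original post; consistent with the verification in "The details" below, it is presumably
$$f_n(x) = \frac{1}{x}\,\chi_{[1,n]}(x) = \begin{cases} \frac{1}{x} & \text{if } 1 \le x \le n,\\ 0 & \text{otherwise,} \end{cases}$$
with uniform limit $f(x) = \frac{1}{x}\,\chi_{[1,\infty)}(x)$.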

This works because: It makes sense to define the $f_n$ as $f(x) = \frac{1}{x}$ "chunk by chunk" since this way the convergence is guaranteed to be uniform. Why? Because how far out we need to go in the sequence so that the difference $f(x) - f_n(x)$ is less than $\epsilon$ only depends on how small (or large) $\epsilon$ is. The location of $x$ doesn't matter!

Also notice we have to define $f_n(x) = 0$ for all $x < 1$ to avoid the trouble spot $\ln(0)$ in the integral $\int f_n$. This also ensures that the area under each $f_n$ is finite, guaranteeing integrability.

The details: Each $f_n$ is integrable since for a fixed $n$,
$$\int_{\mathbb{R}} f_n = \int_1^n \frac{1}{x}\,dx = \ln(n).$$
To see $f_n \to f$ uniformly, let $\epsilon > 0$ and choose $N$ so that $N > 1/\epsilon$. Let $x \in \mathbb{R}$. If $x \le 1$, any $n$ will do, so suppose $x > 1$ and let $n > N$. If $1 < x \le n$, then we have $|f_n(x) - f(x)| = 0 < \epsilon$. And if $x > n$, then
$$\left|\frac{1}{x}\chi_{[1,\infty)}(x) - \frac{1}{x}\chi_{[1,n]}(x)\right| = \left|\frac{1}{x} - 0\right| = \frac{1}{x} < \frac{1}{n} < \frac{1}{N} < \epsilon.$$

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to Tai-Danae Bradley*

 


On Constructing Functions, Part 3

This post is the third example in an ongoing list of various sequences of functions which converge to different things in different ways.

Example 3

A sequence of continuous functions $\{f_n : \mathbb{R} \to [0,\infty)\}$ which converges to 0 in the $L^1$ norm, but does not converge to 0 uniformly.

There are four criteria we want our functions to satisfy:

  1. First off is the uniform convergence. Observe that "$\{f_n\}$ does not converge to 0 uniformly" can mean one of three things:
  • converges to 0 pointwise only
  • converges to something other than 0 (pointwise or uniformly)
  • does not converge at all

So it's up to you to decide which one feels more comfortable to work with. Here we'll choose the second option.

  2. Next, "$\{f_n\}$ converges to 0 in the $L^1$ norm" means that we want to choose our sequence so that the area under the curve of the $f_n$ gets smaller and smaller as $n \to \infty$.
  3. Further, we also want the $f_n$ to be nonnegative (the image of each $f_n$ must lie in $[0,\infty)$); notice this allows us to remove the absolute value sign in the $L^1$ norm: $\int |f_n| = \int f_n$.
  4. Lastly, the functions must be continuous.

A slick* but very simple solution is a sequence of triangles of decreasing area with height 1!
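The picture of the triangles was an image in the original post; a formula matching the computations below is
$$f_n(x) = \begin{cases} 1 - n|x| & \text{if } |x| \le \frac{1}{n},\\ 0 & \text{otherwise,} \end{cases}$$
a triangle of height 1 over the base $[-\frac{1}{n}, \frac{1}{n}]$.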

This works because: At $x = 0$, $f_n(x) = 1$ for all $n$, so there's no way it can converge to zero (much less uniformly). In fact we have $f_n \to f$ pointwise where
$$f(x) = \begin{cases} 1 & \text{if } x = 0,\\ 0 & \text{otherwise.} \end{cases}$$
The area of each triangle is $\frac{1}{n}$, which clearly goes to zero for $n$ large; it's also clear visually that the area is getting smaller. This guarantees $f_n \to 0$ in the $L^1$ norm. Further, each $f_n$ is nonnegative since we've defined it to equal zero as soon as the edges of the triangle reach the $x$-axis. And lastly, each $f_n$ is continuous (it is piecewise linear, and the pieces agree where they meet).

The details: Let $\epsilon > 0$ and $x \in \mathbb{R}$. If $x = 0$, then $f_n(x) = 1$ for all $n$ and so $f_n(x) \to 1$. Otherwise $x > 0$ or $x < 0$. If $x > 1$, then $f_n(x) = 0$ for all $n$. Otherwise, if $x \in (0,1]$, choose $N > \frac{1}{x}$. Then whenever $n > N$ we have $x > \frac{1}{n}$, so $f_n(x) = 0 < \epsilon$. The case when $x < 0$ follows a similar argument.

Lastly, $f_n \to 0$ in the $L^1$ norm since, as we mentioned, the areas are decreasing to 0. Explicitly:
$$\int_{\mathbb{R}} |f_n| = \int_{-1/n}^{0} (1 + nx)\,dx + \int_{0}^{1/n} (1 - nx)\,dx = \frac{1}{n} \to 0.$$

*I can brag because this particular example came from a friend. My own attempt at a solution was not nearly as intuitive.

Constructing the Tensor Product of Modules

The Basic Idea

Today we talk tensor products. Specifically this post covers the construction of the tensor product between two modules over a ring. But before jumping in, I think now’s a good time to ask, “What are tensor products good for?” Here’s a simple example where such a question might arise:

Suppose you have a vector space $V$ over a field $F$. For concreteness, let's consider the case when $V$ is the set of all $2 \times 2$ matrices with entries in $\mathbb{R}$ and let $F = \mathbb{R}$. In this case we know what "$F$-scalar multiplication" means: if $M \in V$ is a matrix and $c \in \mathbb{R}$, then the new matrix $cM$ makes perfect sense. But what if we want to multiply $M$ by complex scalars too? How can we make sense of something like $(3 + 4i)M$? That's precisely what the tensor product is for! We need to create a set of elements of the form
$$(\text{complex number}) \text{ "times" } (\text{matrix})$$
so that the mathematics still makes sense. With a little massaging, this set will turn out to be $\mathbb{C} \otimes_{\mathbb{R}} V$.

So in general, if $F$ is an arbitrary field and $V$ an $F$-vector space, the tensor product answers the question "How can I define scalar multiplication by some larger field which contains $F$?" And of course this holds if we replace the word "field" by "ring" and consider the same scenario with modules.

Now this isn’t the only thing tensor products are good for (far from it!), but I think it’s the most intuitive one since it is readily seen from the definition (which is given below).

So with this motivation in mind, let’s go!

From English to Math

Let $R$ be a ring with 1, let $M$ be a right $R$-module and $N$ a left $R$-module, and suppose $A$ is any abelian group. Our goal is to create an abelian group $M \otimes_R N$, called the tensor product of $M$ and $N$, together with an $R$-balanced map $i : M \times N \to M \otimes_R N$, such that for any $R$-balanced map $\varphi : M \times N \to A$ there is a unique abelian group homomorphism $\Phi : M \otimes_R N \to A$ with $\varphi = \Phi \circ i$, i.e. so the diagram below commutes.

Notice that the statement above has the same flavor as the universal mapping property of free groups!

Definition: Let $X$ be a set. A group $F$ is said to be a free group on $X$ if there is a function $i : X \to F$ such that for any group $G$ and any set map $\varphi : X \to G$, there exists a unique group homomorphism $\Phi : F \to G$ such that the following diagram commutes (i.e. $\varphi = \Phi \circ i$):

In the definition of a free group, $i$ is merely a set map, so in particular we just want ours to be $R$-balanced:

Definition: Let $R$ be a ring with 1. Let $M$ be a right $R$-module, $N$ a left $R$-module, and $A$ an abelian group. A map $\varphi : M \times N \to A$ is called $R$-balanced if for all $m, m_1, m_2 \in M$, all $n, n_1, n_2 \in N$, and all $r \in R$,
$$\varphi(m_1 + m_2, n) = \varphi(m_1, n) + \varphi(m_2, n)$$
$$\varphi(m, n_1 + n_2) = \varphi(m, n_1) + \varphi(m, n_2)$$
$$\varphi(mr, n) = \varphi(m, rn)$$

How will we build such a gadget? By "replacing" the free group $F$ on $M \times N$ by a certain quotient group $F/H$! (We'll define $H$ precisely below.)
These observations give us a road map to construct the tensor product. And so we begin:

Step 1

Let $F$ be a free abelian group generated by $M \times N$ and let $A$ be an abelian group. Then by definition (of free groups), if $\varphi : M \times N \to A$ is any set map, and $M \times N \hookrightarrow F$ by inclusion, then there is a unique abelian group homomorphism $\Phi : F \to A$ so that the following diagram commutes.

Step 2

The problem is that the inclusion map $M \times N \hookrightarrow F$ is not $R$-balanced! To fix this, we must "modify" the target space $F$ by replacing it with the quotient $F/H$, where $H \le F$ is the subgroup of $F$ generated by elements of the form

  • $(m_1 + m_2, n) - (m_1, n) - (m_2, n)$
  • $(m, n_1 + n_2) - (m, n_1) - (m, n_2)$
  • $(mr, n) - (m, rn)$

where $m_1, m_2, m \in M$, $n_1, n_2, n \in N$ and $r \in R$. Why elements of this form? Because if we define the map $i : M \times N \to F/H$ by
$$i(m, n) = (m, n) + H,$$
we'll see that $i$ is indeed $R$-balanced! Let's check:
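The check itself appeared as an image in the original post; presumably it runs as follows, using only the definition of $H$:
$$i(m_1 + m_2, n) = (m_1 + m_2, n) + H = \big[(m_1, n) + (m_2, n)\big] + H = i(m_1, n) + i(m_2, n),$$
since $(m_1 + m_2, n) - (m_1, n) - (m_2, n) \in H$; similarly $i(m, n_1 + n_2) = i(m, n_1) + i(m, n_2)$; and
$$i(mr, n) = (mr, n) + H = (m, rn) + H = i(m, rn),$$
since $(mr, n) - (m, rn) \in H$.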

So, are we done now? Can we really just replace $F$ with $F/H$ and replace the inclusion map with the map $i$, and still retain the existence of a unique homomorphism $\Phi : F/H \to A$? No! Of course not. $F/H$ is not a free group generated by $M \times N$, so the diagram below is bogus, right?

Not totally. We haven’t actually disturbed any structure!

How can we relate the pink and blue lines? We’d really like them to be the same. But we’re in luck because they basically are!

Step 3

A homomorphism $f : F \to A$ descends to a well-defined homomorphism $F/H \to A$ as long as $H \subseteq \ker(f)$, that is, as long as $f(h) = 0$ for all $h \in H$. And notice that this condition, $f(H) = 0$, forces $f$ to be $R$-balanced!

Let’s check:
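The check was again displayed as an image in the original; presumably it goes like this. Since $f$ is a homomorphism on $F$ and sends each generator of $H$ to $0$,
$$0 = f\big((m_1 + m_2, n) - (m_1, n) - (m_2, n)\big) \quad\Longrightarrow\quad f(m_1 + m_2, n) = f(m_1, n) + f(m_2, n),$$
and likewise the other two types of generators give $f(m, n_1 + n_2) = f(m, n_1) + f(m, n_2)$ and $f(mr, n) = f(m, rn)$, which are exactly the $R$-balanced conditions.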

Sooooo… homomorphisms $f : F \to A$ such that $H \subseteq \ker(f)$ are the same as $R$-balanced maps from $M \times N$ to $A$! (Technically, I should say homomorphisms $f$ restricted to $M \times N$.) In other words, we have

In conclusion, to say "abelian group homomorphisms from $F/H$ to $A$ are the same as (isomorphic to) $R$-balanced maps from $M \times N$ to $A$" is simply the hand-wavy way of saying

Whenever $i : M \times N \to F/H$ is an $R$-balanced map and $\varphi : M \times N \to A$ is an $R$-balanced map where $A$ is an abelian group, there exists a unique abelian group homomorphism $\Phi : F/H \to A$ such that the following diagram commutes:

And this is just what we want! The last step is merely the final touch:

Step 4

We define the abelian quotient group $F/H$ to be the tensor product of $M$ and $N$,
$$M \otimes_R N := F/H,$$
whose elements are cosets: a general element is a finite sum $\sum_i m_i \otimes n_i$, where $m \otimes n := (m, n) + H$ for $m \in M$ and $n \in N$ is referred to as a simple tensor. And there you have it! The tensor product, constructed.

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to Tai-Danae Bradley*


On Constructing Functions, Part 2

This post is the second example in an ongoing list of various sequences of functions which converge to different things in different ways.

Example 2

A sequence of functions $\{f_n : \mathbb{R} \to \mathbb{R}\}$ which converges to 0 uniformly but does not converge to 0 in $L^1$.
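The definition was shown as a graph in the original post; the formula matching the computations below is presumably
$$f_n = \frac{1}{n}\,\chi_{(0,n)}, \qquad\text{i.e.}\qquad f_n(x) = \begin{cases} \frac{1}{n} & \text{if } 0 < x < n,\\ 0 & \text{otherwise.} \end{cases}$$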

This works because: The sequence tends to 0 as $n \to \infty$ since the height of each function tends to 0, while the region where $f_n$ takes on this decreasing height, namely $(0, n)$, tends towards all of $\mathbb{R}^+$ (and $f_n$ is already 0 on $\mathbb{R}^- \cup \{0\}$). The convergence is uniform because the number of times we have to keep "squishing" the rectangles until their height is less than $\epsilon$ does not depend on $x$.

The details: Let $\epsilon > 0$ and choose $N \in \mathbb{N}$ so that $N > \frac{1}{\epsilon}$ and let $n > N$. Fix $x \in \mathbb{R}$.

  • Case 1 ($x \le 0$ or $x \ge n$): Then $f_n(x) = 0$ and so $|f_n(x) - 0| = 0 < \epsilon$.
  • Case 2 ($0 < x < n$): Then $f_n(x) = \frac{1}{n}$ and so $|f_n(x) - 0| = \frac{1}{n} < \frac{1}{N} < \epsilon$.

Finally, $f_n \not\to 0$ in $L^1$ since
$$\int_{\mathbb{R}} |f_n| = \int_{(0,n)} \frac{1}{n} = \frac{1}{n}\,\lambda\big((0, n)\big) = 1.$$

Remark: Here's a question you could ask: wouldn't $f_n = n\,\chi_{(0,\frac{1}{n})}$ work here too? Both are tending to 0 everywhere and both involve rectangles of area 1. The answer is "kinda." The problem is that the convergence of $n\,\chi_{(0,\frac{1}{n})}$ is only pointwise, not uniform. BUT Egoroff's Theorem gives us a way to actually "make" it uniform!

On the notation above: For a measurable set $X \subset \mathbb{R}$, denote the set of all Lebesgue integrable functions $f : X \to \mathbb{R}$ by $L^1(X)$. Then a sequence of functions $\{f_n\}$ is said to converge in $L^1$ to a function $f$ if $\lim_{n\to\infty} \int |f_n - f| = 0$.

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to Tai-Danae Bradley*

 


On Constructing Functions, Part 1

Given a sequence of real-valued functions $\{f_n\}$, the phrase "$f_n$ converges to a function $f$" can mean a few things:

  • $f_n$ converges uniformly
  • $f_n$ converges pointwise
  • $f_n$ converges almost everywhere (a.e.)
  • $f_n$ converges in $L^1$ (the set of Lebesgue integrable functions)
  • and so on…

Other factors come into play if the $f_n$ are required to be continuous, defined on a compact set, integrable, etc. So since I do not have the memory of an elephant (whatever that phrase means…), I've decided to keep a list of different sequences that converge (or don't converge) to different functions in different ways. With each example I'll also include a little (and hopefully intuitive) explanation for why. Having these sequences close at hand is especially useful when analysing the behavior of certain functions or constructing counterexamples.

The first sequence we'll look at is one which converges almost everywhere, but does not converge in $L^1$ (the set of Lebesgue integrable functions).

Example 1

A sequence of functions $\{f_n : \mathbb{R} \to \mathbb{R}\}$ which converges to 0 almost everywhere but does not converge to 0 in $L^1$.
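The definition was shown as a graph in the original post; the formula matching the computations below is presumably
$$f_n = n\,\chi_{[0,\frac{1}{n}]}, \qquad\text{i.e.}\qquad f_n(x) = \begin{cases} n & \text{if } 0 \le x \le \frac{1}{n},\\ 0 & \text{otherwise.} \end{cases}$$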

This works because: Recall that to say $f_n \to 0$ almost everywhere means $f_n \to 0$ pointwise on $\mathbb{R}$ except for a set of measure 0. Here, the set of measure zero is the singleton set $\{0\}$ (at $x = 0$, $f_n(x) = n$, and we can't make this less than $\epsilon$ for any $\epsilon > 0$). So $f_n$ converges to 0 pointwise on $\mathbb{R} \setminus \{0\}$. This holds because if $x < 0$ or $x > 1$, then $f_n(x) = 0$ for all $n$. Otherwise, if $x \in (0,1]$, we can choose $n$ appropriately:

The details: Let $\epsilon > 0$ and $x \in (0,1]$ and choose $N \in \mathbb{N}$ so that $N > \frac{1}{x}$. Then whenever $n > N$, we have $n > \frac{1}{x}$, which implies $x > \frac{1}{n}$ and so $f_n(x) = 0$. Hence $|f_n(x) - 0| = 0 < \epsilon$.

Further*, $f_n \not\to 0$ in $L^1$ since
$$\int_{\mathbb{R}} |f_n| = \int_{[0,\frac{1}{n}]} n = n\,\lambda\big([0, \tfrac{1}{n}]\big) = 1.$$

Remark: Notice that Egoroff's theorem applies here! We just proved that $f_n \to 0$ pointwise a.e. on $\mathbb{R}$, but Egoroff says that we can actually get uniform convergence a.e. on a bounded subset of $\mathbb{R}$, say $(0,1]$.

In particular, for each $\epsilon > 0$ we are guaranteed the existence of a subset $E \subset (0,1]$ such that $f_n \to 0$ uniformly on $E$ and $\lambda((0,1] \setminus E) < \epsilon$. In fact, it should be clear that the subset must be something like $(\frac{\epsilon}{2}, 1]$ (the "zero region" in the graph above). Then no matter where $x$ is in $(\frac{\epsilon}{2}, 1]$, we can always find $n$ large enough – namely all $n$ which satisfy $\frac{1}{n} < \frac{\epsilon}{2}$ – so that $f_n(x) = 0$, i.e. $f_n \to 0$ uniformly on $(\frac{\epsilon}{2}, 1]$. And indeed, $\lambda\big((0,1] \setminus (\frac{\epsilon}{2}, 1]\big) = \frac{\epsilon}{2} < \epsilon$ as claimed.

On the notation above: For a measurable set $X \subset \mathbb{R}$, denote the set of all Lebesgue integrable functions $f : X \to \mathbb{R}$ by $L^1(X)$. Then a sequence of functions $\{f_n\}$ is said to converge in $L^1$ to a function $f$ if $\lim_{n\to\infty} \int |f_n - f| = 0$.

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to Tai-Danae Bradley*


Generating Random Walks in Mathematics

With connections to the study of gambling, Brownian motion, fractals, and more, random walks are a favourite topic in recreational mathematics.

The diagram above (from Energy transformations during horizontal walking by F. G. Benedict and H. Murschhauser, published in 1915) suggests one method for generating walking data. Creating random walk simulations in Fathom or TinkerPlots is a little more straightforward.

First simulation – using sliders to determine a ‘base angle’

This first example lets you set up random walks where the direction chosen is based on an angle k*2pi/n for a fixed n (whose value is determined by a slider) and a random k (a random integer between 1 and n).

First, create a slider n, then create the attributes below and finally add the data (any number is fine – start with ~500 cases). The formulas below were entered in TinkerPlots, but would work equally well in Fathom.

Plots of (x,y) will show the walk, and plots of (step, distance) will show how the distance from the origin changes over the course of the walk. Different values for n provide walks with their own particular geometries.
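The original attribute formulas were shown in a screenshot that isn't reproduced here. As a rough stand-in for the same idea (slider value n, random k between 1 and n, unit step at angle k*2pi/n), here is a minimal Java sketch; all names and the choice of n = 8 and 500 cases are illustrative:

```java
import java.util.Random;

public class AngleWalk {
    public static void main(String[] args) {
        int n = 8;            // plays the role of the slider
        int cases = 500;      // number of steps ("cases")
        Random rng = new Random();

        double x = 0, y = 0;  // the walk starts at the origin
        for (int step = 1; step <= cases; step++) {
            int k = rng.nextInt(n) + 1;          // random integer between 1 and n
            double angle = k * 2 * Math.PI / n;  // direction k*2pi/n
            x += Math.cos(angle);                // unit step in that direction
            y += Math.sin(angle);
            double distance = Math.sqrt(x * x + y * y); // distance from the origin
            System.out.printf("%d,%.4f,%.4f,%.4f%n", step, x, y, distance);
        }
    }
}
```

Plotting the (x, y) columns of this output gives the walk; plotting (step, distance) gives the distance-from-origin view described above.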

The walks start at (0,0) and wander about the plane from there. Re-randomizing (CNTRL-Y) generates new walks.

The simulation gives lots of nice pictures of random walks. You could generate statistics from these by adding measures and measure collections.

One limitation of this simulation is that it is difficult to determine exactly when the walker has returned to the start (0,0). This turns out to be an interesting question for random walks on the plane (see the wikipedia entry for more on this). Because of the inexactness in the positions calculated using sine and cosine, the walker seems to never return to the origin. There are several ways of dealing with this, but one is to design a simpler simulation that uses exact values – one that sticks to lattice points (x, y), where x and y are both integers.

Second simulation – sticking to Integer lattice points

This second simulation can be thought of as an 'urban walker' where all paths must follow a strictly laid out grid, like some downtown streets. The exactness of the positions means that we can detect with confidence when the walker has crossed back to their starting point. For this simulation, no slider is required – just enter the attributes and add cases.

Using the crossed_start attribute as a filter or to gather measures, you will find that walks often quickly pass over the starting point. You will also find that as you increase the number of cases, the straight ‘etch-a-sketch’ lines of the urban walk take on very interesting fractal-like contours.
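Again, the Fathom/TinkerPlots attribute formulas aren't reproduced above; here is a minimal Java stand-in for the lattice version, including a crossed_start-style flag (names are illustrative):

```java
import java.util.Random;

public class LatticeWalk {
    public static void main(String[] args) {
        int cases = 500;
        Random rng = new Random();

        int x = 0, y = 0; // exact integer coordinates, so returns are detectable
        for (int step = 1; step <= cases; step++) {
            // pick one of the four grid directions at random
            switch (rng.nextInt(4)) {
                case 0: x++; break;
                case 1: x--; break;
                case 2: y++; break;
                default: y--; break;
            }
            boolean crossedStart = (x == 0 && y == 0); // back at the starting point?
            System.out.printf("%d,%d,%d,%b%n", step, x, y, crossedStart);
        }
    }
}
```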

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to dan.mackinnon*

 


Farey definition, property, and algorithm

Here is an outline of how you can go about generating this data. The definition and properties of Farey sequences here are from Hardy & Wright’s An Introduction to the Theory of Numbers (which I am slowly working my way through).
The Farey sequence of order n is the sequence of irreducible fractions between 0 and 1 whose denominator does not exceed n. So, the elements of the sequence are of the form h/k, where 0 ≤ h ≤ k ≤ n and h and k are relatively prime.

The main theorem about Farey numbers provides them with their characteristic property (Theorem 29 in TofN). The characteristic property of Farey sequences is that if h/k, h″/k″, and h′/k′ are successive terms in a Farey sequence, then h″/k″ is the mediant of h/k and h′/k′. If h/k and h′/k′ are two reduced fractions, their mediant is given by (h+h′)/(k+k′).

It's nice when a theorem tells you how to implement an algorithm. This property tells us that Farey sequences can be built iteratively or recursively, beginning with F1 = {0/1, 1/1}. The algorithm to do this is a nice one – it's probably not often used as a textbook exercise in recursion because it helps to have some data structure or class to represent the fractions, and a way of telling if integers are relatively prime (you can use the Euclidean algorithm to implement a gcd() function).

Here is an outline of how to calculate the next Farey sequence, given that you have one already.

0) input a Farey sequence oldSequence (initial sequence will be {0/1, 1/1})

1) create a new empty sequence newSequence

2) iterate over oldSequence and find out its level by finding the largest denominator that occurs; store this in n

3) set n= n+1

4) iterate over oldSequence, looking at each pair of adjacent elements (left and right)

4.1) add left to newSequence
4.2) if the denominators of left and right sum to n, form their mediant
4.2.1) if the numerator and denominator of the mediant are relatively prime, add mediant to newSequence

5) add the last element of oldSequence to newSequence

Note that you only need to add in new elements where the denominators of existing adjacent elements sum to the n value – when this happens you form the mediant of the two adjacent elements. Furthermore, the mediant is only added if the fraction can’t be reduced.

Below is some Java-ish code corresponding to the above – it assumes that oldSequence and newSequence are ArrayLists and that you have a class Fraction that has fields num (numerator) and den (denominator).
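The code itself appeared as an image in the original post. Here is a sketch along the lines it describes; the Fraction class and the gcd() helper are filled in as assumptions, so treat this as an illustration of the outline rather than the post's original listing:

```java
import java.util.ArrayList;

class Fraction {
    int num, den;
    Fraction(int num, int den) { this.num = num; this.den = den; }
}

public class Farey {
    // Euclidean algorithm, used to test whether a fraction is reducible
    static int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

    // given one Farey sequence, build the next one (steps 0-5 above)
    static ArrayList<Fraction> nextFarey(ArrayList<Fraction> oldSequence) {
        ArrayList<Fraction> newSequence = new ArrayList<Fraction>();

        // find the level of the old sequence: its largest denominator, plus one
        int n = 0;
        for (Fraction f : oldSequence) n = Math.max(n, f.den);
        n = n + 1;

        // walk over adjacent pairs (left, right)
        for (int i = 0; i < oldSequence.size() - 1; i++) {
            Fraction left = oldSequence.get(i);
            Fraction right = oldSequence.get(i + 1);
            newSequence.add(left);
            if (left.den + right.den == n) {
                Fraction mediant = new Fraction(left.num + right.num, left.den + right.den);
                if (gcd(mediant.num, mediant.den) == 1) newSequence.add(mediant);
            }
        }
        newSequence.add(oldSequence.get(oldSequence.size() - 1)); // last element
        return newSequence;
    }

    public static void main(String[] args) {
        // start with F1 = {0/1, 1/1} and print the first five sequences
        ArrayList<Fraction> seq = new ArrayList<Fraction>();
        seq.add(new Fraction(0, 1));
        seq.add(new Fraction(1, 1));
        for (int order = 1; order <= 5; order++) {
            StringBuilder sb = new StringBuilder("F" + order + ": ");
            for (Fraction f : seq) sb.append(f.num).append("/").append(f.den).append(" ");
            System.out.println(sb);
            seq = nextFarey(seq);
        }
    }
}
```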

Here are the first five Farey sequences that you get from the algorithm:

F1 = {0/1, 1/1}
F2 = {0/1, 1/2, 1/1}
F3 = {0/1, 1/3, 1/2, 2/3, 1/1}
F4 = {0/1, 1/4, 1/3, 1/2, 2/3, 3/4, 1/1}
F5 = {0/1, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 1/1}

The image at the top of the post was generated by implementing the algorithm in Processing, and using the result to draw the associated Ford circles – you could do something similar in regular Java (or other language). If you draw the Ford Circles associated with the sequence, the circle for a fraction “frac” will be centered at (x,y) and have a radius r where

x = (scale)*frac.num/frac.den

y = r

r = (scale)/(2*(frac.den)^2)

where “scale” is some scaling factor (probably in the 100’s) that increases the size of the image.
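For instance, a minimal plain-Java check of these formulas might look like this (the class and method names are illustrative):

```java
public class FordCircle {
    // circle for the reduced fraction num/den, using the formulas above
    static double[] circleFor(int num, int den, double scale) {
        double r = scale / (2.0 * den * den);  // r = scale / (2 * den^2)
        double x = scale * num / (double) den; // x = scale * num/den
        double y = r;                          // the circle sits on the x-axis
        return new double[] { x, y, r };
    }

    public static void main(String[] args) {
        double[] c = circleFor(1, 2, 200.0);   // Ford circle for 1/2 at scale 200
        System.out.printf("center (%.1f, %.1f), radius %.1f%n", c[0], c[1], c[2]);
    }
}
```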

Here I decided to draw two copies of each circle, one on top of the other.

 

The fact that it contains only fractions between 0 and 1, and that it contains all reduced fractions with denominators up to n, connects Farey sequences to Euler's totient function. Euler's totient function is an arithmetic function that, for a given k, counts the integers less than k that are relatively prime to it. For k > 1, this is exactly the number of fractions with denominator k that appear in the Farey sequence of order n ≥ k.

The Farey algorithm, how to draw Ford circles, and the connection to Euler’s totient function are described nicely in J.H. Conway and R.K. Guy’s The Book of Numbers – a great companion to a book like TofN.

 

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to dan.mackinnon*

 


The Sequence of Primes

As I make my way through Hardy & Wright’s An Introduction to the Theory of Numbers,  I am hoping to work it into my recreational math pursuits – coming up with interesting (but not too heavy) activities that correspond roughly to the material in the text.

The first two chapters are on the sequence of primes. Here’s the activity: obtain a list of primes, import them into Fathom, and construct plots that explore pn and pi(n) and other aspects of the sequence that manifest themselves in the first couple of thousand terms.

In my Fathom experiment, I imported the first 2262 prime numbers.

If you import a sequential list of primes into Fathom (under the attribute prime) and add another attribute n=caseindex, you can create two nice plots. Plot A should have prime as the x axis and n  as the y axis. This shows the function pi(n). To this plot you should add the function x/ln(x) and visually compare the two curves. Plot B should have the x and y axis reversed. On this graph, plotting the function y = x*ln(x) shows how closely this approximation for pn (the nth prime) comes to the actual values.

 

You can add further attributes to look at the distance between primes dist=prime-prev(prime), and also the frequency of twin primes is_twin = (dist=2)or(next(dist)=2).

You can also add attributes to keep a running count of the twin primes, and a running average (the proportion of twin primes so far). The plot above shows how the ratio of twin primes diminishes as the number of primes increases. The plot at the top of the post suggests the distribution of primes and twin primes (in blue) in the numbers up to the 2262nd prime.
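If you'd rather generate the same attributes outside of Fathom, a rough Java equivalent might look like the sketch below. The sieve bound of 20000 is an assumption chosen to give roughly the first 2262 primes, and the column names mirror the attributes described above:

```java
import java.util.ArrayList;

public class PrimeSequence {
    public static void main(String[] args) {
        int limit = 20000; // sieve bound; gives roughly the first ~2262 primes
        boolean[] composite = new boolean[limit + 1];
        ArrayList<Integer> primes = new ArrayList<Integer>();
        for (int i = 2; i <= limit; i++) {
            if (!composite[i]) {
                primes.add(i);
                for (int j = 2 * i; j <= limit; j += i) composite[j] = true;
            }
        }

        int twinCount = 0;
        for (int n = 1; n <= primes.size(); n++) {
            int p = primes.get(n - 1);
            // dist = prime - prev(prime), as in the Fathom attribute
            int dist = (n > 1) ? p - primes.get(n - 2) : 0;
            // is_twin = (dist = 2) or (next(dist) = 2)
            boolean isTwin = dist == 2
                    || (n < primes.size() && primes.get(n) - p == 2);
            if (isTwin) twinCount++;
            double pnApprox = n * Math.log(n);         // approximation for the nth prime
            double piApprox = p / Math.log(p);         // approximation for pi at p
            double twinRatio = (double) twinCount / n; // running proportion of twin primes
            System.out.printf("%d,%d,%d,%b,%.1f,%.1f,%.4f%n",
                    n, p, dist, isTwin, pnApprox, piApprox, twinRatio);
        }
    }
}
```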

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to dan.mackinnon*


The Integral Domain Hierarchy, Part 2

In any area of math, it's always a good idea to keep a few counterexamples in your back pocket. Here are examples/non-examples from some of the different subsets of integral domains.

$\mathbb{Z}[i\sqrt{5}]$ is an integral domain which is not a UFD

That $\mathbb{Z}[i\sqrt{5}]$ is an integral domain is easy to check (just computation).

  • It's not a UFD since we can write $6 = 2 \cdot 3 = (1 + i\sqrt{5})(1 - i\sqrt{5})$ as two distinct factorizations into irreducibles*

$\mathbb{Z}[x]$ is a UFD which is not a PID

We know $\mathbb{Z}[x]$ is a UFD because $\mathbb{Z}$ is a UFD (recall, a commutative ring $R$ is a UFD iff $R[x]$ is a UFD).

  • The ideal $(2, x) = \{2f(x) + xg(x) : f(x), g(x) \in \mathbb{Z}[x]\}$ (polynomials with even constant term) is not principal**

$\mathbb{Z}\!\left[\frac{1 + i\sqrt{19}}{2}\right]$ is a PID which is not a Euclidean domain

  • This is a PID since it has a Dedekind-Hasse norm (see Dummit and Foote, 3rd ed., §8.2).
  • It is not a Euclidean domain since it has no universal side divisors (ibid.).

$\mathbb{Z}$ is a Euclidean domain which is not a field

$\mathbb{Z}$ is a Euclidean domain via the absolute value norm (which gives the familiar division algorithm).

  • It is not a field since the only elements which are units are $1$ and $-1$.

(*) Check that $2$, $3$, $1 + i\sqrt{5}$, and $1 - i\sqrt{5}$ are indeed irreducible in $\mathbb{Z}[i\sqrt{5}]$:

Write $2 = \alpha\beta$ for $\alpha, \beta \in \mathbb{Z}[i\sqrt{5}]$. Then $\alpha = a + ib\sqrt{5}$ and $N(\alpha) = a^2 + 5b^2$ for some integers $a, b$. Since $4 = N(2) = N(\alpha)N(\beta)$, we must have $a^2 + 5b^2 = 1, 2$ or $4$. Notice $b = 0$ must be true (since $a^2 + 5b^2 \notin \{1, 2, 4\}$ for $|b| \ge 1$ and any $a$). Hence either $\alpha = a = \pm 1$ or $\pm 2$. If $\alpha = \pm 1$, then $\alpha$ is a unit. If $\alpha = \pm 2$, then we must have $\beta = \pm 1$ and so $\beta$ is a unit.

  • Showing 3 is irreducible follows a similar argument.

Write $1 + i\sqrt{5} = \alpha\beta$ with $\alpha = a + ib\sqrt{5}$ so that $N(\alpha) = a^2 + 5b^2 \in \{1, 2, 3, 6\}$ since $6 = N(\alpha)N(\beta)$. Consider two cases: (case 1) If $b = 0$, then $a^2 \in \{1, 2, 3, 6\}$, which is only true if $a^2 = 1$, and so $\alpha = a = \pm 1$ is a unit. (case 2) If $b \ne 0$, we can only have $b^2 = 1$ (since $b^2 > 1$ gives a contradiction), and so $a^2 + 5 \in \{1, 2, 3, 6\}$, which implies $a^2 = 1$. Hence $\alpha = \pm 1 \pm i\sqrt{5}$ and so $N(\alpha) = 6$. This implies $N(\beta) = 1$ and so $\beta = \pm 1$, which is a unit.

Showing $1 - i\sqrt{5}$ is irreducible follows a similar argument.

(**) Check that the ideal $(2, x)$ is not principal in $\mathbb{Z}[x]$:

  • Suppose to the contrary that $(2, x) = (f(x))$ for some polynomial $f(x) \in \mathbb{Z}[x]$. Since $2 \in (f(x))$, we must have $2 = f(x)p(x)$ for some $p(x) \in \mathbb{Z}[x]$. Hence $0 = \deg f(x) + \deg p(x)$, which implies both $f(x)$ and $p(x)$ are constants. In particular, since $2 = \pm 1 \cdot \pm 2$, we need $f(x), p(x) \in \{\pm 1, \pm 2\}$. If $f(x) = \pm 1$, then $(f(x)) = \mathbb{Z}[x]$, which is a contradiction since $(f(x)) = (2, x)$ must be a proper ideal (not every polynomial over $\mathbb{Z}$ has even constant term). It follows that $f(x) = \pm 2$. But since $x \in (f(x))$ as well, $x = 2r(x)$ for some $r(x) \in \mathbb{Z}[x]$. But of course this is impossible for any polynomial $r(x)$ with integer coefficients. Thus $(2, x)$ is not principal.

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to Tai-Danae Bradley*

 


The Integral Domain Hierarchy, Part 1

Here is a list of some of the subsets of integral domains, along with the reasoning (a.k.a proofs) of why the bullseye below looks the way it does. Part 2 of this post will include back-pocket examples/non-examples of each.

Integral Domain: a commutative ring with 1 where the product of any two nonzero elements is always nonzero

Unique Factorization Domain (UFD): an integral domain where every nonzero element (which is not a unit) has a unique factorization into irreducibles

Principal Ideal Domain (PID): an integral domain where every ideal is generated by exactly one element

Euclidean Domain: an integral domain $R$ with a norm $N$ and a division algorithm (i.e. there is a norm $N$ so that for every $a, b \in R$ with $b \ne 0$, there are $q, r \in R$ so that $a = bq + r$ with $r = 0$ or $N(r) < N(b)$)

Field: a commutative ring where every nonzero element has an inverse

Every field is a Euclidean domain. Because… We can just choose the zero norm: $N(r) = 0$ for all $r \in F$.

Proof: Let $F$ be a field and define a norm $N$ so that $N(r) = 0$ for all $r \in F$. Then for any $a, b \in F$ with $b \ne 0$, we can write
$$a = b(b^{-1}a) + 0.$$

Every Euclidean domain is a PID. Because… If $I \triangleleft R$ is an arbitrary nonzero ideal in the Euclidean domain $R$, then $I = (d)$, where $d \in I$ is such that $d$ has the smallest norm among all elements in $I$. Prove this using the division algorithm on $d$ and some $a \in I$.

Proof: Let $R$ be a Euclidean domain with respect to the norm $N$ and let $I \triangleleft R$ be an ideal. If $I = (0)$, then $I$ is principal. Otherwise let $d \in I$ be a nonzero element such that $d$ has the smallest norm among all elements in $I$. We claim $I = (d)$. That $(d) \subset I$ is clear, so let $a \in I$. Then by the division algorithm, there exist $q, r \in R$ so that $a = dq + r$ with $r = 0$ or $N(r) < N(d)$. Then $r = a - dq \in I$ since $a, d \in I$. But by minimality of $d$, this implies $r = 0$. Hence $a = dq \in (d)$ and so $I \subset (d)$.

Every PID is a UFD. Because… Every PID has the ascending chain condition (acc) on its ideals!* So to prove PID $\Rightarrow$ UFD, just recall that an integral domain $R$ is a UFD if and only if 1) it has the acc on principal ideals** and 2) every irreducible element is also prime.

Proof: Let $R$ be a PID. Then 1) $R$ has the ascending chain condition on principal ideals and 2) every irreducible element is also a prime element. Hence $R$ is a UFD.

Every UFD is an integral domain. Because… By definition.

Proof: By definition.

*Def: In general, an integral domain $R$ has the acc on its principal ideals if these two equivalent conditions are satisfied:

  1. Every sequence $I_1 \subset I_2 \subset \cdots$ of principal ideals is stationary (i.e. there is an integer $n_0 \ge 1$ such that $I_n = I_{n_0}$ for all $n \ge n_0$).
  2. For every nonempty subset $X \subset R$, there is an element $m \in X$ such that whenever $a \in X$ and $(m) \subset (a)$, then $(m) = (a)$.

**To see this, use part 1 of the definition above. If $I_1 \subset I_2 \subset \cdots$ is an ascending chain, consider their union $I = \bigcup_{n=1}^{\infty} I_n$. That guy must be a principal ideal (check!), say $I = (m)$. This implies that $m$ must live in some $I_{n_0}$ for some $n_0 \ge 1$ and so $I = (m) \subset I_{n_0}$. But since $I$ is the union, we have for all $n \ge n_0$,
$$(m) = I \supset I_n \supset I_{n_0} = (m).$$
Voila!

Every field $F$ is a PID

because the only ideals in a field are $(0)$ and $F = (1)$! And every field is vacuously a UFD since all elements are units. (Recall, $R$ is a UFD if every nonzero, non-invertible element (an element which is not a unit) has a unique factorization into irreducibles.)

In an integral domain, every maximal ideal is also a prime ideal. 

(Proof: Let $R$ be an integral domain and $M \triangleleft R$ a maximal ideal. Then $R/M$ is a field and hence an integral domain, which implies $M \triangleleft R$ is a prime ideal.)

But the converse is not true (see the counterexample below). However, the converse is true in a PID because of the added structure!

(Proof: Let $R$ be a PID and $(p) \triangleleft R$ a nonzero prime ideal for some $p \in R$. Then $p$ is a prime – and hence an irreducible – element (prime $\Leftrightarrow$ irreducible in PIDs). Since in a PID a principal ideal is maximal whenever it is generated by an irreducible element, we conclude $(p)$ is maximal.)

This suggests that if you want to find a counterexample – an integral domain with a prime ideal which is not maximal – try to think of a ring which is not a PID: In $\mathbb{Z}[x]$, consider the ideal $(p)$ for a prime integer $p$. Then $(p)$ is a prime ideal, yet it is not maximal since
$$(p) \subset (p, x) \subset \mathbb{Z}[x].$$

If $F$ is a field, then $F[x]$ – the ring of polynomials in $x$ with coefficients in $F$ – is a Euclidean domain with the norm $N(p(x)) = \deg p(x)$, where $p(x) \in F[x]$.

By the integral domain hierarchy above, this implies every ideal in $F[x]$ is of the form $(p(x))$ (i.e. $F[x]$ is a PID) and every polynomial can be factored uniquely into a product of prime polynomials (just like the integers)! The next bullet gives an "almost converse" statement.

If $R[x]$ is a PID, then $R$ must be a field.

To see this, simply observe that $R \subset R[x]$ and so $R$ must be an integral domain (since a subring of an integral domain inherits commutativity and the "no zero divisors" property). Since $R[x]/(x) \cong R$, it follows that $R[x]/(x)$ is also an integral domain. This proves that $(x)$ is a prime ideal. But prime implies maximal in a PID! So $R[x]/(x)$ – and therefore $R$ – is actually a field.

  • This is how we know, for example, that $\mathbb{Z}[x]$ is not a PID (in the counterexample a few bullets up) – $\mathbb{Z}$ is not a field!

For more such insights, log into www.international-maths-challenge.com.

*Credit for article given to Tai-Danae Bradley*