Let $\Omega\subset\mathbb{R}^2$ be a bounded convex domain, $f$ be a positive smooth function on $\Omega$ and $\phi:\partial\Omega\rightarrow\mathbb{R}$ be a continuous function. It is known that as long as $f$ does not tend to $+\infty$ too fast at the boundary, for every $t>0$ there is a unique convex function $u_t\in C^0(\overline{\Omega})\cap C^\infty(\Omega)$ satisfying $$ \det D^2 u_t=t\,f,\quad u_t|_{\partial\Omega}=\phi. $$

**Question.** Does $u_t$ depend smoothly on $t$?

In an interview (I link the Google translation), Voevodsky talks about how, in the late 2000s, he worked on the problem of "restoring the history of populations according to their modern genetic composition". Some of his unpublished papers on this topic are now available online. For example, a paper titled "Singletons" is available on the IAS website. Why did Voevodsky abandon the subject of this rather fleshed-out paper so suddenly?

My question is whether the functor $F:\mathbf{Grp}\to\mathbf{Set}$ that sends a group to its set of elements of finite order is representable. I have the sense that it shouldn't be, but so far I've failed to prove it.

Let $U$ be a subspace of the finite dimensional vector space $V$ over a field $\mathbb{k}$. Let $B_V$ and $B_U$ be fixed bases for $V$ and $U$ respectively. Let $u \in U$ and let's give ourselves $[u]_V$, the vector representing $u$ with respect to $B_V$.

**How do we effectively compute $[u]_U$ when $\mathbb{k}$ is a finite field, say the one with two elements?**
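To make the setup concrete, here is a minimal sketch of the most direct approach, Gaussian elimination over $\mathbb{F}_2$ on the system $\sum_j c_j b_j = u$, where the $b_j$ are the vectors of $B_U$ written in $B_V$-coordinates. The function name `coords_in_subspace` and the list-of-bits representation are assumptions made for illustration; this is not claimed to be the most efficient method.

```python
def coords_in_subspace(basis_U, u):
    """Solve sum_j c_j * basis_U[j] = u over GF(2).

    basis_U: list of k vectors (lists of 0/1 of length n), the basis B_U
             written in B_V-coordinates (hypothetical input format).
    u:       list of 0/1 of length n, i.e. [u]_V.
    Returns the list [u]_U of k bits, or None if u is not in U.
    """
    k, n = len(basis_U), len(u)
    # One row per coordinate of V: k coefficient columns plus the right-hand side.
    rows = [[basis_U[j][i] for j in range(k)] + [u[i]] for i in range(n)]
    pivot_row = 0
    for col in range(k):
        p = next((r for r in range(pivot_row, n) if rows[r][col]), None)
        if p is None:
            raise ValueError("basis_U is not linearly independent")
        rows[pivot_row], rows[p] = rows[p], rows[pivot_row]
        # Addition in GF(2) is XOR; clear this column in every other row.
        for r in range(n):
            if r != pivot_row and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # Rows below the pivots have zero coefficients; a nonzero right-hand side
    # there means the system is inconsistent, so u is not in U.
    if any(rows[r][k] for r in range(k, n)):
        return None
    return [rows[r][k] for r in range(k)]


print(coords_in_subspace([[1, 1, 0], [0, 1, 1]], [1, 0, 1]))  # [1, 1]
```

In practice one would pack rows into machine words so that each row operation is a handful of XORs, but the logic is the same.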

For a fiber bundle $M\longrightarrow N$ where $\dim N=n$, a non-degenerate 1-form $\theta$ on $M$ generates the differential ideal $\mathcal{I}$, and the Lagrangian $\mathcal{L}$ is an $n$-form on $M$. All these data form a non-degenerate variational problem.

In Griffiths' book *Exterior Differential Systems and the Calculus of Variations* page 84, he writes that

For non-degenerate variational problems the rank of $\theta$ is everywhere the maximum possible value $n$.

Then the question comes: Suppose there's another non-degenerate variational problem whose fiber bundle $W\longrightarrow N$ fibers over the same base $N$. Let $\varphi$ be the non-degenerate 1-form on $W$; then its rank is also restricted to be $n$.

If there is a smooth map $f:M\longrightarrow W$, we can induce a pullback $f^*:\varphi\longrightarrow\theta$. We know that a 1-form is a section of the cotangent bundle (whose fibers are vector spaces). A constant rank linear transformation between two vector spaces is invertible. Does the fact that $\varphi$ and $\theta$ have the same rank $n$ mean that $f^*$ is invertible?

In Quantum Field Theory and Jones Polynomial (equation 2.16), Witten used a formula relating the APS eta-invariant to the Chern-Simons action. Witten claimed that it is derived from the Atiyah-Patodi-Singer index theorem. I cannot find any clue how one can derive that formula from APS index theorem.

In the following, I will briefly explain equation 2.16 in that paper, and will write the Atiyah-Patodi-Singer index theorem. Please help me figure out how one can derive that equation from the APS index theorem.

Let $Y$ be a closed three dimensional manifold. Let $G$ be a compact simple gauge group, whose Lie algebra is denoted by $\mathfrak{g}$. Let $E$ be a $G$-bundle over $Y$, with connection $1$-form $A\in\Omega^{1}(Y,\mathfrak{g})$. The Chern-Simons action is given by

$$I[A]=\frac{1}{4\pi}\int_{Y}\mathrm{Tr}\left(A\wedge dA+\frac{2}{3}A\wedge A\wedge A\right).$$

Let $D_{A}$ be the covariant derivative. Then, one is interested in the operator

$$L=\bigg( \begin{matrix} \ast D_{A}&-D_{A}\ast\\D_{A}\ast&0 \end{matrix} \bigg).$$

To be specific, one is interested in $L_{-}$, the restriction of $L$ to odd forms, i.e.

$$L_{-}=L|_{\Omega^{1}(Y)\oplus\Omega^{3}(Y)}.$$

One defines its eta-invariant

$$\eta_{L_{-}}(A)=\lim_{s\rightarrow 0}\sum_{j}\frac{\mathrm{sign}(v_{j})}{|v_{j}|^{s}}$$

where $v_{j}$ are non-zero eigenvalues of the operator $L_{-}$.

Similarly, one defines the eta-invariant for the trivial gauge $A=0$, denoted by $\eta_{L_{-}}(0)$. With this trivial gauge, one is interested in the operator

$$L=\bigg( \begin{matrix} \ast d&-d\ast\\d\ast&0 \end{matrix} \bigg),$$

restricted to odd forms.

*Equation 2.16 in Quantum Field Theory and Jones Polynomial:*

$$\frac{1}{4}\left(\eta_{L_{-}}(A)-\eta_{L_{-}}(0)\right)=\frac{c_{2}(G)}{2\pi}I[A]$$

Witten claimed that this is a result from Atiyah-Patodi-Singer index theorem.

The original statement of APS index theorem is from Spectral Asymmetry and Riemannian Geometry I, which is very hard to read for physics students. In the following, I copy the statement of APS index theorem from Aspects of Boundary Problems in Analysis and Geometry, which is easier to read.

Let the closed three dimensional manifold $Y$ be the boundary of a compact, oriented, four dimensional Riemannian manifold $M$, i.e., $\partial M=Y$. Let $S(M)$ be a spin bundle over $M$, then one has the splitting $S(M)=S^{+}(M)\oplus S^{-}(M)$ into chiral halves. Let $E$ be a Hermitian vector bundle over $M$. The twisted Dirac operator on $M$ is defined as

$$\mathcal{D}=\bigg( \begin{matrix} 0&D^{+}\\D^{-}&0 \end{matrix} \bigg),$$

with

$$D^{+}:\Gamma(M,S^{+}(M)\otimes E)\rightarrow\Gamma(M,S^{-}(M)\otimes E)$$

$$D^{-}:\Gamma(M,S^{-}(M)\otimes E)\rightarrow\Gamma(M,S^{+}(M)\otimes E)$$

where $\Gamma(M,S^{\pm}(M)\otimes E)$ is the set of sections of the bundle $S^{\pm}(M)\otimes E$.

In addition, one assumes the following conditions:

$M$ has a collar neighborhood $N=[0,1)_{s}\times Y$ near $Y$, where the metric is a product

$$g=ds^{2}+h$$

with $h$ a metric on $Y$.

Denote the space of square-integrable spinors on $Y$ by $L^{2}(Y,S(Y))$. Near the boundary $Y$, the Dirac operator $\mathcal{D}$ is of product type on the collar of the following form:

$$\mathcal{D}=\Gamma^{s}(\partial_{s}+D_{Y})$$

where $\Gamma^{s}:S^{\pm}(N)\otimes E|_{N}\rightarrow S^{\mp}(N)\otimes E|_{N}$ is a unitary mapping

$$L^{2}(Y,S(Y)\otimes E|_{Y})\rightarrow L^{2}(Y,S(Y)\otimes E|_{Y})$$

of spinors on $Y$, and

$$D_{Y}:\Gamma(Y,S(Y)\otimes E|_{Y})\rightarrow\Gamma(Y,S(Y)\otimes E|_{Y})$$

is the self-adjoint Dirac operator on $Y$.

Denote the APS eta-invariant for $D_{Y}$ by

$$\eta_{D_{Y}}(s)=\sum_{\lambda\in\mathrm{spec}(D_{Y})\backslash\left\{0\right\}}m_{\lambda}\frac{\mathrm{sign}(\lambda)}{|\lambda|^{s}}$$

where $m_{\lambda}$ is the multiplicity of the eigenvalue $\lambda$.

Let $\widehat{M}$ be the non-compact elongation of $M$ defined as follows:

$$\widehat{M}=(-\infty,0]_{s}\times Y\cup_{\partial M}M$$

One denotes the extension of $D_{M}$ on $\widehat{M}$ by $\widehat{D}$.

*Atiyah-Patodi-Singer:*

$$\mathrm{ind}(\widehat{D})=\int_{M}\hat{A}(TM)\mathrm{ch}(E)-\frac{1}{2}\left(\eta(D_{Y})+\dim\ker D_{Y}\right)$$

$$=\frac{-1}{8\pi^{2}}\int_{M}\mathrm{Tr}\left(F\wedge F\right)+\frac{\dim E}{192\pi^{2}}\int_{M}\mathrm{Tr}\left(R\wedge R\right)-\frac{1}{2}\left(\eta(D_{Y})+\dim\ker D_{Y}\right)$$

Please tell me how I can derive Witten's formula (2.16) from the above APS index formula. The quadratic Casimir $c_{2}(G)$ is supposed to come from replacing the trace in the adjoint representation by the trace in the fundamental representation in the Chern-Simons action. However, from the APS index formula, I cannot see anything in the adjoint representation.

I've seen some "physical" derivations of the formula (2.16) in Gauge Dependence of the Eta-Function in Chern-Simons Field Theory and the Vilkovisky-DeWitt Correction, and Perturbative Expansion of Chern-Simons Theory with Noncompact Gauge Group, but they made me even more confused.

What is even worse, I found a more general formula generalizing Witten's formula (2.16) in Lectures on Quantization of Gauge Systems (equation 60 on page 53) and Computer Calculation of Witten's 3-Manifold Invariant (equation 1.31 on page 86).

Also, in the physics paper Global Symmetries, Counterterms, and Duality in Chern-Simons Matter Theories with Orthogonal Gauge Groups (equation 4.2 on page 33), the APS index theorem looks very different from the original one, with a factor of the quadratic Casimir.

Please tell me where this second order Casimir is coming from.

My question is about artificial intelligence, specifically neural networks. I'm wondering what search method backpropagation uses and what its search space is. I can't find resources that state this.

As a physicist who knows (something) about General Relativity, I've become accustomed to the term "maximally symmetric space", meaning an $n$-dimensional manifold with $\frac{n(n+1)}{2}$ Killing vectors. A demonstration of that can be found in Weinberg's book "Gravitation and Cosmology", but it is always assumed that the manifold is Riemannian, and the connection is the Levi-Civita connection.

**Question(s)**: Is the above statement true for affine manifolds? Can you recommend some references on the subject?

*P.S.*: I'm interested in considering connections with torsion and non-metricity.

From a planar graph $\Gamma$, equipped with an integer-valued weight function $d:E(\Gamma) \sqcup V(\Gamma) \to \mathbb{Z}$, one can build a $3$-manifold $M_{\Gamma}$ as follows. For each vertex $v$, draw a small planar unknot centered at $v$. For each edge $e$ connecting vertices $v$ and $w$, add a series of $d(e)$ clasps between the corresponding unknots (with positive weights represented by right-handed clasps and negative weights represented by left-handed clasps). The result is a link with unknotted components - call it $L_{\Gamma}$. To get the $3$-manifold $M_{\Gamma}$, perform Dehn surgery on each component of $L_{\Gamma}$, with framings $$ f(v) = d(v) + \sum_{e \ni v} d(e). $$
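As a small illustration of the framing formula above, the following sketch computes $f(v)$ from the weights. The data layout (a dict of vertex weights and a list of `(v, w, d)` edge triples) and the function name `framings` are assumptions for the example, and self-loops are assumed to have been removed already.

```python
def framings(vertex_weights, edges):
    """Return f(v) = d(v) + sum of d(e) over the edges e incident to v.

    vertex_weights: dict mapping each vertex to its integer weight d(v).
    edges: list of (v, w, d) triples, one per edge; parallel edges are
           allowed, self-loops are assumed to be absent.
    """
    f = dict(vertex_weights)
    for v, w, d in edges:
        if v == w:
            raise ValueError("self-loops are assumed to have been eliminated")
        f[v] += d
        f[w] += d
    return f


# A path a - b - c with edge weights 2 and -1.
print(framings({"a": 0, "b": 1, "c": -1}, [("a", "b", 2), ("b", "c", -1)]))
# {'a': 2, 'b': 2, 'c': -2}
```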

In fact, *every 3-manifold $M$ is diffeomorphic to a manifold of the form $M_{\Gamma}$*. This can be proved using ideas from a paper of Matveev and Polyak (*A Geometrical Presentation of the Surface Mapping Class Group and Surgery*), as follows. Choose a Heegaard splitting for $M$, and write the gluing map as a composition of Lickorish twists. Then use the graphical calculus in sections 3 and 4 of that paper (omitting the twists denoted $\epsilon_i$) to produce a tangle whose plat closure is a framed link of the form $L_{\Gamma}$, such that the result of surgery on this link is $M$. This argument is given by Polyak in slides available on his website.

There is another proof as well, which involves taking an arbitrary link surgery presentation and repeatedly simplifying it. There is a natural way to measure the complexity of a link diagram, such that links of minimal complexity are of the form $L_{\Gamma}$. It is always possible to reduce this complexity by adding cancelling unknotted components and doing handleslides.

One can take the idea further and show that there is a finite set of local moves which suffice to relate any two weighted planar graphs representing the same 3-manifold. Indeed, this follows abstractly from the fact that the mapping class group is finitely presented. However, the moves produced by Wajnryb's presentation (for example) are rather complicated and nasty-looking. It is therefore natural to ask if one can find a more appealing set of moves.

One must certainly include the following moves (please excuse the lack of pictures):

- Self-loops and edges with weight $0$ can be eliminated.
- Any two parallel edges can be combined, at the expense of adding their weights.
- Suppose that $v$ is a vertex incident to exactly one edge $e$, with $d(e) = \pm f(v) = \pm 1$. Then $v$ and $e$ can be deleted, at the expense of changing the framing on the other endpoint of $e$, call it $w$. If $f(v) = 0$, then $w$ (and all of its incident edges) can be removed from the graph.
- Suppose that $v$ is a vertex incident to exactly two edges $e_1$ and $e_2$, with $d(e_1) = d(e_2) = - f(v) = \pm 1$, then the vertex $v$ can be replaced with a single edge joining the opposite endpoints of $e_1$ and $e_2$, call them $w_1$ and $w_2$. If $d(e_1) = - d(e_2)$ and $f(v) = 0$, then $e_1$ and $e_2$ can be contracted, with the resulting vertex having weight $d(w_1) + d(w_2)$.
- Suppose that $v$ is a vertex incident to exactly three edges, and suppose that $d(e_1) = d(e_2) = -d(e_3) = f(v)$. Then $v$, together with all $e_i$, can be eliminated at the expense of adding a triangle connecting the opposite endpoints of the edges $e_i$.

Let's call any of the above moves a "blowdown", and let's call their inverses "blowups". All of them can be easily deduced using Kirby calculus, or from relations in the mapping class group.

**Question 1**: Are blowups and blowdowns sufficient to relate
any two planar graph presentations of a given $3$-manifold?

If the answer to this question is no, then there are additional non-local moves to consider:

- If two edges $e_1$ and $e_2$ connect the same pair of vertices (but are not necessarily parallel), then they can be combined, at the expense of adding their weights.
- If there is a vertex $v$ which divides $\Gamma$ into multiple components, those components can be "permuted around $v$". Any of these components which is connected to $v$ by a single edge $e$ with weight $\pm 1$ can also be "flipped over", at the expense of changing the weight on $e$.
- If there are two vertices $v$ and $w$ which separate the graph into multiple components, then any component which is joined to both $v$ and $w$ by a single pair of edges with opposite weights in $\{\pm 1\}$ can be "flipped over", at the expense of changing the weights on the edges.

Let's call any of the above moves a "mutation".

**Question 2**: Are blowups, blowdowns, and mutations sufficient to relate any two
planar graph presentations of a given $3$-manifold?

If the answer to this question is also no, then it would be good to have a nice answer to the following (admittedly vague) question:

**Question 3**: What is the "simplest possible" set of moves which are sufficient to relate any two planar graph presentations of a given $3$-manifold?

One argument in favor of a positive answer to Questions 1 or 2, or at least a very nice answer to Question 3, is that Kirby calculus itself admits a finite set of simple local moves. One approach might be to find a *canonical* way of simplifying an arbitrary link surgery presentation to a planar graph presentation, and trace the effects of a Kirby move through the simplification process to see which graph moves are required to implement it.

There is also a relationship with double branched covers, which might be relevant. If all vertex weights $d(v) = 0$, then $M_{\Gamma}$ can be identified (after doing surgery on an essential 2-sphere) with the double cover branched over a link $Z \subset S^3$, whose "checkerboard graph" is $\Gamma$. In this picture, Reidemeister moves on $Z$ can be realized by blowdowns and blowups, and Conway mutations can be realized by graph mutations. Note that these moves do not suffice to relate links with diffeomorphic double branched covers - this might be viewed as evidence in favor of a negative answer to Questions 1 and 2.

Let $X$ be the stack of rank $1$ degree $b$ coherent sheaves $E$ with torsion of length at most 1 on an elliptic curve $C$. Let $Y$ be the stack of pairs $E^{'} \subset E$ such that $E \in X$ and $E/E^{'}\cong \mathcal O_x$ for a point $x \in C$. Let $X_1$ be the stack of rank $1$ degree $b$ coherent sheaves whose torsion has length exactly $1$. Question: is it true (and if so, how to see) that $Y$ is the blow-up of $X \times C$ along $X_1$, where $X_1 \subset X \times C$ by $E \mapsto (E, \rm{Torsion}(E))$?

Let $n$ be a positive integer (assume $n$ is prime for simplicity), and let $x_k = \pm1$ for $k = 0,1,2,\ldots, n-1$. Let $\rho$ be a primitive $n$-th root of unity. I am interested in lower bounds for the absolute value of sums of the form:

$$S_n = \sum_{k=0}^{n-1} x_k \rho^k,$$ assuming that such a sum is not equal to zero (when $n$ is prime this can only happen when all the $x_k$'s have the same sign).

One can obtain an easy lower bound of $\frac{1}{n^{n-1}}$ by multiplying this algebraic integer by all its Galois conjugates, but given that there are $2^n$ sums, I am expecting a better lower bound (hopefully $e^{-Cn}$ for some constant $C$). Alternatively, is anything known about the probability $$Pr(|S_n| < e^{-100n})?$$ I expect this quantity to be exponentially small. I think that an argument in Tao-Vu's paper (https://arxiv.org/abs/1307.4357), related to Nguyen-Vu's optimal inverse Littlewood-Offord theorem, might show that this probability is smaller than $n^{-C}$ (for any fixed $C$ and $n \to \infty$).
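For very small $n$ the minimum can simply be enumerated. The sketch below is a brute-force check, assuming $\rho = e^{2\pi i/n}$ and using a hypothetical helper name `min_nonzero_abs_sum`; it uses floating point, so it cannot resolve exponentially small values and is only meaningful for small $n$.

```python
import cmath
from itertools import product


def min_nonzero_abs_sum(n, tol=1e-9):
    """Smallest nonzero |sum_k x_k rho^k| over all sign vectors x in {+1,-1}^n.

    rho = exp(2*pi*i/n) is an assumption; sums with absolute value below
    `tol` are treated as zero (floating-point cutoff, also an assumption).
    """
    rho = cmath.exp(2j * cmath.pi / n)
    powers = [rho ** k for k in range(n)]
    best = None
    for signs in product((1, -1), repeat=n):
        s = abs(sum(x * p for x, p in zip(signs, powers)))
        if s > tol and (best is None or s < best):
            best = s
    return best


for n in (3, 5, 7):
    print(n, min_nonzero_abs_sum(n), n ** -(n - 1))  # minimum vs. the Galois bound
```

The loop visits all $2^n$ sign vectors, so this is only a sanity check for the first few primes, not a way to probe the asymptotics.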

I would be grateful for any information related to sums of this form, similar sums or some understanding in how difficult this question can be.

Thanks!

I would like to construct an atlas for the stack of sheaves E of rank 1 and degree b on an elliptic curve C such that E has torsion of length at most 1. Am I allowed to fix both the determinant L of the sheaf E and the point x on C where E possibly has torsion? In that case I would construct an atlas as PHom(L(-2x),F) where F is the unique nontrivial extension of L(-x) by L(-x). Would this help me to construct an atlas for the original stack? If not, could you suggest a different way to construct an atlas for the original stack?

I came across the following inequality in one of my calculations ($X,Y$ are centered random variables):

$$\operatorname{E}(X^2Y^2)-\operatorname{E}(X^2)\operatorname{E}(Y^2) \geq 2 \operatorname{E}(XY)^2$$

or, written in terms of covariances,

$$\operatorname{Cov}(X^2,Y^2) \geq 2 \operatorname{Cov}(X,Y)^2.$$

If $(X,Y)=(U,V)$ is a two-dimensional centered Gaussian, this becomes an equality and if $(X,Y)=(H_p(U),H_q(V))$, where $(U,V)$ is still a two-dimensional centered Gaussian with $\operatorname{E}(U^2)=\operatorname{E}(V^2)=1$ and $H_k$ denotes the $k$th (probabilists') Hermite polynomial, the inequality above is strict whenever $p,q \geq 2$.
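For reference, the Gaussian equality case can be checked directly from Isserlis' (Wick's) theorem, which expresses a fourth moment of a centered Gaussian vector as a sum over the three pairings of its factors. Applied to the factors $U,U,V,V$, it gives

$$\operatorname{E}(U^2V^2)=\operatorname{E}(U^2)\operatorname{E}(V^2)+2\operatorname{E}(UV)^2,$$

so that $\operatorname{Cov}(U^2,V^2)=2\operatorname{Cov}(U,V)^2$ exactly.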

I have a feeling that something like this should be true for arbitrary random variables but couldn't prove it. Does this inequality look familiar to anybody or do you have an idea on how this could be proved/disproved?

Let $X$ be a nonsingular threefold of degree $d$ contained in $\mathbb{P}^4$ and let $P$ be a point on $X$. Is it possible to find a plane $H$ contained in $T_P (X)$ such that $X \cap H$ is not a union of $d$ lines?

Let $G$ be a simple graph with $n$ vertices and $\lambda$ be the largest eigenvalue of its Laplacian operator $L=D-A$. I have some evidence for the following conjecture:

**Conjecture**: If $G$ has diameter $\delta>3$ then $\lambda\leq n-1$.

I need a proof or a counterexample for this conjecture. Does there exist a good general upper bound for $\lambda$ in terms of $\delta$ which includes the above conjecture (if it is true)?
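For experimenting with small cases, here is a rough numerical sketch that estimates $\lambda$ by power iteration on $L=D-A$. The function names, the edge-list input format, the iteration count and the seeded random start vector are all assumptions made for illustration; it only provides evidence on individual graphs.

```python
import random


def laplacian(n, edges):
    """Dense Laplacian L = D - A of a simple graph on vertices 0..n-1."""
    L = [[0] * n for _ in range(n)]
    for a, b in edges:
        L[a][a] += 1
        L[b][b] += 1
        L[a][b] -= 1
        L[b][a] -= 1
    return L


def largest_laplacian_eigenvalue(n, edges, iters=500, seed=0):
    """Estimate the largest eigenvalue of L by power iteration.

    L is positive semidefinite, so power iteration converges to the top
    eigenvalue provided the random start vector is not orthogonal to the
    top eigenspace (true with probability one).
    """
    L = laplacian(n, edges)
    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    for _ in range(iters):
        y = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    # Rayleigh quotient of the (unit) iterate.
    y = [sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(x, y))


# Path on 5 vertices: diameter 4, so the conjecture predicts lambda <= 4.
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(largest_laplacian_eigenvalue(5, path_edges))
```

For the path on 5 vertices the exact value is $2+2\cos(\pi/5)\approx 3.618$, consistent with the conjectured bound $n-1=4$.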

This is a philosophy:

**step one**: A theorem: suppose $r:\Bbb N\to (0,1)$ is the function for which $r(n)$ is obtained by putting a decimal point at the beginning of $n$, for instance $r(34880)=0.34880$; then $r(\Bbb P)$ is dense in the interval $[0.1,1]$. (proof)

**step two**: a topological space based on the theorem above on odd numbers is made as: Let $Z_1:=\{\pm(2n-1)\mid n\in\Bbb N\}\cup\{0\}$ and $\lt_1$ be a total order relation (not well ordering) on $Z_1$ with: $\begin{cases} \forall m,n\in\Bbb N\\ 2n-1\lt_12m-1 & \text{iff}\quad r(2n-1)\lt r(2m-1),\\ -2n+1\lt_1-2m+1 & \text{iff}\quad r(2n-1)\gt r(2m-1),\\ -2n+1\lt_10\lt_12m-1\end{cases}$

then assume $\mathfrak T$ is a topology on $Z_1$ induced by $\lt_1$ hence $(Z_1,\mathfrak T)$ is a Hausdorff space.

**step three**: importance of the prime number theorem: the prime number theorem concerns all prime numbers simultaneously, and it also involves the concept of a limit and the logarithm function.

**step four**: amplifying the theory via algebraic structures: $(Z_1,\star,\circ)$ is a unique factorization domain with: $\begin{cases} \forall m,n\in\Bbb N,\,\forall v\in Z_1,\quad e=0\\ (2m-1)\star(-2m+1)=0\\ (4m-3)\star(4n-3)=4m+4n-5\\ (4m-3)\star(-4n+3)=\begin{cases} 4m-4n+1 & m\lt n\\ 4m-4n-1 & m\gt n\end{cases}\\ (4m-3)\star(4n-1)=4m+4n-3\\ (4m-3)\star(-4n+1)=\begin{cases} 4m-4n-1 & m\le n\\ 4m-4n-3 & m\gt n\end{cases}\\ (-4m+3)\star(-4n+3)=-4m-4n+5\\ (-4m+3)\star(4n-1)=\begin{cases} 4n-4m+1 & m\le n\\ 4n-4m+3 & m\gt n\end{cases}\\ (-4m+3)\star(-4n+1)=-4m-4n+3\\ (4m-1)\star(4n-1)=4m+4n-1\\ (4m-1)\star(-4n+1)=\begin{cases} 4m-4n+1 & m\lt n\\ 4m-4n-1 & m\gt n\end{cases}\\ (-4m+1)\star(-4n+1)=-4m-4n+1\\ 0\circ v=0,\quad1\circ v=v,\quad((-1)\circ v)\star v=0\\ (4m-3)\circ(4n-3)=8mn-4m-4n+1\\ (4m-3)\circ(-4n+3)=-8mn+4m+4n-1\\ (4m-3)\circ(4n-1)=8mn-4n-1\\ (4m-3)\circ(-4n+1)=-8mn+4n+1\\ (-4m+3)\circ(-4n+3)=8mn-4m-4n+1\\ (-4m+3)\circ(4n-1)=-8mn+4n+1\\ (-4m+3)\circ(-4n+1)=8mn-4n-1\\ (4m-1)\circ(4n-1)=8mn-1\\ (4m-1)\circ(-4n+1)=-8mn+1\\ (-4m+1)\circ(-4n+1)=8mn-1\end{cases}$

**step five**: rewriting the algebraic equation of **Goldbach** (in my individual point of view Goldbach is equivalent to the induction axiom, anyhow) in accordance with the UFD:

Guess $1$: $\forall n\in\Bbb N,\,n$ is a prime iff $2n-1$ is an irreducible element in $(Z_1,\star,\circ)$ and we have: $\begin{cases} \forall m,n,r,s\in\Bbb N,\\\\m\pm n=r\qquad\text{iff}\quad(2m-1)\star(\pm(2n-1))=2r-1\\\\m\cdot n=s\qquad\text{iff}\quad(2m-1)\circ(2n-1)=2s-1\quad\text{iff}\\2s-1=(2m-1)\star(2m-1)\star...(2m-1)\,(n\text{ times is summed})\end{cases}$.

Irreducible elements in $(Z_1,\star,\circ)$ except $3$ are of the form $4k-3,\,k\in\Bbb N$.

Guess $2$: $Y:=\{2p-1\mid p\in\Bbb P\setminus\{2\}\}$ is dense in $N_1:=\{n\in Z_1\mid n\gt0\}$.

**Goldbach's conjecture**: $\forall n\in\Bbb N,\,\exists r,s\in\Bbb N$, such that $4n+7=(4r-3)\star(4s-3),$ & $4r-3,4s-3$ are irreducible elements greater than $3$ in $(Z_1,\star,\circ)$, meantime $2r-1,2s-1\in\Bbb P$ & $4n+7$ is of the form $4k-1,\,k\in\Bbb N$.

$\color{red}{\text{step six}}$: analytic number theory: each open set in the topological space $(Z_1,\mathfrak T)$ is equivalent with a mathematical technique in the analytic number theory so we are capable to redefine the topology via some techniques of analytic number theory, and it is worth noting that a subspace topology of the Euclidean topology on $[0.1,1]$ is the same $\mathfrak T$.

**step seven**: the strength of algebraic topology

**step eight** homotopy groups (no idea yet, only I can dream two spheres $S^2$)

$\color{red}{\text{Question}}$: which techniques could be cited in step six? (Each open set in the topology is anchored to a fixed mathematical technique of analytic number theory, and conversely each technique of that type clarifies only one open set, so that this relation is two-sided.)

Thanks a lot in advance.

What are examples of well-received mathematical papers in which the author provides detail on how a surprising solution to a problem was found?

I am especially looking for papers that also document the dead ends of the investigation, i.e. ideas that seemed promising but led nowhere, and where the motivation and inspiration that led to the right ideas came from.

By "surprising solution" I mean solutions that feel right at first reading and it isn't clear why they haven't been found earlier.

In Tate's famous paper about $p$-divisible groups, for any prime $p$ he asked whether there exist nontrivial $p$-divisible groups over $\mathbb Z$, i.e. $p$-divisible groups which are not a product of powers of $\mu_{p^\infty}$ and $\mathbb Q_p/ \mathbb Z_p $.

In works of V. A. Abrashkin, he claims that there exist nontrivial $p$-divisible groups for $p=2$ and some irregular primes. What about the general case? Do we know a lot of primes $p$ such that every $p$-divisible group over $\mathbb Z$ is trivial?

Motivation: On the other hand, we know there is no abelian scheme over $\mathbb Z$, a result whose proof involves $p$-divisible groups.

Edit: One answer below suspects the works of V. A. Abrashkin might contain errors, so maybe a further reference would be good. Anyway, the key question is what we know for a general prime $p$: is there any progress towards Tate's question? With the techniques of modern $p$-adic Hodge theory, maybe such a question could be solved.

Here's a fair-sequencing problem that doesn't quite match the usual fair-division problems. I think that, like those, the answer should also be the Thue-Morse sequence ("balanced alternation"), because the same heuristic reasoning that suggests it's the fairest way there works here as well, but the problem doesn't seem to *reduce* to them, so it's not obvious. (See here for more on using Thue-Morse for fair division, or this earlier MathOverflow question.)

Anyway, the problem is stage-striking (as is used in certain competitive video games for stage selection :) ). There are two players and $n+1$ objects ("stages"). The two players have different preferences regarding the stages (ideally, *opposite* preferences, but you'll see below why we don't assume that). The two players will take turns (in some order -- thus the question) removing ("striking") stages that have not already been struck; once only a single stage is left, that stage is selected (both players "get" it). The question, then, is what is the fairest order for stage striking; as mentioned above, I suspect it should be Thue-Morse (one player strikes on 0, the other player strikes on 1), for similar reasons that this is the answer to the old problem of what order to take turns in for fair division.

Of course this raises the question of how we're formalizing this and what we mean by "fair". I'll present here the formalization of the problem that (after discussing this with some other people) I think is best, but answers to other ways of formalizing it would also be OK so long as they don't trivialize the problem.

So -- note that if the players assign the stages opposite values (i.e. they agree about which stages give how much of an advantage to who), as you would expect, then the striking order becomes irrelevant, so long as both players get the same number of strikes; regardless of order, the median stage will be selected. So instead we have to assume the players may disagree about which stages advantage who. Also, since we can only really deal with the order of the stages here, we won't allow them to have arbitrary numeric values as in the fair-division problem; rather we'll assume each player assigns the $n+1$ stages the values $0, 1, \ldots, n$, so that the value of a stage to a player depends only on where it falls in their preference ordering.

Now, since perfect information makes the problem trivial, we'll go all the way in the opposite direction -- each player's preferences are uniformly random; or rather, each player sees the other's preferences as uniformly random. What we want to compare, then, and to make as equal as possible, is the expected value that player 1 gets (when player 2 strikes randomly), vs the expected value that player 2 gets (when player 1 strikes randomly).

(I'm pretty sure that, in this formulation, we can assume that each player always strikes their least-preferred stage at each step, and that there is no advantage from deviating from this. But obviously correct me if I'm wrong there...)

So, for instance, in this model, if $n=2$, then the first player to strike gets an expected value of $3/2$ (they eliminate their least preferred stage and get one of the remaining two at random), while the second player to strike gets an expected value of $5/3$ (they have a $2/3$ chance their most-preferred stage is not eliminated, and a $1/3$ chance they have to settle for the median). So we get a difference of $1/6$. You see?
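The model just described can be evaluated exactly for small $n$. The sketch below does so under the stated assumptions (the player of interest always strikes their least-preferred remaining stage, and the opponent strikes uniformly at random); the function name `expected_value` and the encoding of a striking order as a tuple of 0s and 1s are choices made for illustration.

```python
from fractions import Fraction
from functools import lru_cache


def expected_value(order, player):
    """Expected value of the selected stage for `player` (0 or 1).

    order: tuple of 0/1 of length n; order[i] is the player who makes the
           i-th strike. Stages are labelled by `player`'s own values 0..n.
    Assumes `player` strikes their least-preferred remaining stage and the
    opponent strikes uniformly at random.
    """
    n = len(order)

    @lru_cache(maxsize=None)
    def go(remaining, step):
        # `remaining` is a sorted tuple of the values still available.
        if step == n:
            (last,) = remaining
            return Fraction(last)
        if order[step] == player:
            return go(remaining[1:], step + 1)  # strike own worst stage
        total = sum(
            go(remaining[:i] + remaining[i + 1:], step + 1)
            for i in range(len(remaining))
        )
        return total / len(remaining)  # opponent's strike is uniform

    return go(tuple(range(n + 1)), 0)


order = (0, 1)  # n = 2: player 0 strikes first, then player 1
print(expected_value(order, 0), expected_value(order, 1))  # 3/2 5/3
```

Comparing `expected_value(order, 0)` with `expected_value(order, 1)` over all orders of a given length gives a direct way to test the Thue-Morse guess for small $n$; the state space grows like $2^{n+1}$, so this is only practical for modest $n$.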

So the question then is, is the Thue-Morse striking order the fairest? Or is it something else? Is it at least the fairest when $n$ is a power of $2$, even if it might not be otherwise?

**EDIT**: Actually, a thought -- maybe it should be *reverse* Thue-Morse? (As in, if $n=12$, you would go $011001101001$ rather than $011010011001$; you just reverse the sequence, and then, if necessary, swap the roles of the players so as to start with a $0$.) This seems possible because here it's going later, rather than going earlier, that seems to confer an advantage. Of course, if $n$ is a power of two, this distinction is irrelevant, as reversing the sequence would merely swap the roles of the players.


Consider $u,v\in S^{M-1}\subset \mathbb{C}^M$ to be two independent unit norm random vectors on the $(M-1)$-dimensional complex sphere $S^{M-1}$. In addition, $u$ follows an isotropic distribution, i.e., $u$ is uniformly distributed on the complex sphere $S^{M-1}$. What is the distribution of $Z=|u\cdot v|^2$? This question has been asked before (Distribution of dot product of two unit random vectors), but I get a different result. I get that $Z$ follows a Beta$(1,M-1)$ distribution by simulation.
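A simulation of the kind mentioned can be sketched as follows. Since $u$ is isotropic and independent of $v$, the sketch fixes $v$ to be the first standard basis vector, so that $Z=|u_1|^2$; $u$ is sampled by normalizing a vector of independent complex Gaussians. The function names, sample size and seed are assumptions for illustration.

```python
import random


def sample_Z(M, rng):
    """One draw of Z = |u . v|^2 with u uniform on the complex unit sphere.

    By rotational invariance of u, v is fixed to the first basis vector
    (an assumption that loses no generality when u and v are independent).
    """
    g = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(M)]
    norm_sq = sum(abs(z) ** 2 for z in g)
    return abs(g[0]) ** 2 / norm_sq


def simulate(M, samples=20000, seed=0):
    """Return the empirical mean of Z and the empirical P(Z <= 1/2)."""
    rng = random.Random(seed)
    draws = [sample_Z(M, rng) for _ in range(samples)]
    mean = sum(draws) / samples
    cdf_half = sum(z <= 0.5 for z in draws) / samples
    return mean, cdf_half


M = 4
mean, cdf_half = simulate(M)
# Beta(1, M-1) has mean 1/M and CDF 1 - (1 - z)^(M-1).
print(mean, 1 / M)
print(cdf_half, 1 - 0.5 ** (M - 1))
```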