Questions tagged [machine-learning]
How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?
3,357 questions
0 votes · 0 answers · 32 views
How to estimate the inverse of a non-invertible matrix?
So I'm working on a machine learning problem where my solution requires taking the inverse of a matrix at some point. The problem is that this matrix is sometimes non-invertible. In theory the ...
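A common remedy in this situation (a sketch, not necessarily what this asker's model requires) is the Moore–Penrose pseudoinverse or Tikhonov/ridge regularization, both of which remain well defined when the matrix is singular:

```python
import numpy as np

# A singular (non-invertible) matrix: the second row is twice the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
b = np.array([1.0, 2.0])

# Moore–Penrose pseudoinverse: defined even when A is singular, and gives
# the minimum-norm least-squares solution of A x = b.
A_pinv = np.linalg.pinv(A)

# Tikhonov/ridge regularization: (A^T A + lam I) is invertible for any lam > 0,
# and (A^T A + lam I)^{-1} A^T approaches the pseudoinverse as lam -> 0.
lam = 1e-6
A_ridge = np.linalg.solve(A.T @ A + lam * np.eye(2), A.T)

x = A_pinv @ b  # here b lies in the column space of A, so A x reproduces b
```

Which of the two is appropriate depends on why the matrix is singular; ridge is the usual choice when the singularity comes from collinear features.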
-1 votes · 0 answers · 28 views
Distribution of two combined ML models
Due to its complexity, the problem was divided into two models: a stationary model and a model that corrects the stationary one for temporal effects, i.e.
$X = X_{stat} + X_{time}$
...
-2 votes · 1 answer · 96 views
EM algorithm for estimating worker ability [closed]
I cannot understand how to obtain the results in the M-step.
This formulation is from https://papers.nips.cc/paper_files/paper/2012/file/cd00692c3bfe59267d5ecfac5310286c-Paper.pdf
0 votes · 0 answers · 28 views
Formulating a solution ansatz for the 1D heat equation in polar coordinates to learn the PDE in a PINN setting
Hello Math Stack Exchange Community,
I am working on solving a partial differential equation (PDE) with a neural network in a PINN-like fashion, and I am seeking advice on identifying an appropriate ...
6 votes · 1 answer · 68 views
What is the collection of functions that a given finite neural network can approximate with ease?
To my understanding, one version of the universal approximation theorem runs as follows: Let $\Phi$ be the family of (trained) feedforward neural networks of bounded width, arbitrary depth, and mild ...
0 votes · 1 answer · 23 views
Question about the likelihood function of discriminative models
I'm a little confused about the likelihood function. For discriminative models, we have a hypothesis function $h_{\theta}(x) = p(y \mid x ; \theta)$. Using the principle of maximum likelihood we want ...
0 votes · 0 answers · 25 views
Calculating functional derivative for a Physics-Informed Neural Network (PINN) using Automatic Differentiation
I'm working with a Physics-Informed Neural Network (PINN) to approximate the solution of a 1D Poisson equation:
$\frac{d^2u}{dx^2} = f$
Here, I have an MLP with weight parameters $\theta$ that takes a ...
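In practice the residual $u'' - f$ is obtained from the framework's automatic differentiation (e.g. two applications of `torch.autograd.grad` with `create_graph=True`). To illustrate the mechanism without any framework, here is a tiny forward-mode "jet" that propagates $(u, u', u'')$ through a one-hidden-layer network; all weights and the source term below are made up:

```python
import math

class Jet:
    """Truncated Taylor jet (value, 1st derivative, 2nd derivative) for forward-mode AD."""
    def __init__(self, f, df=0.0, d2f=0.0):
        self.f, self.df, self.d2f = f, df, d2f
    def __add__(self, o):
        o = o if isinstance(o, Jet) else Jet(o)
        return Jet(self.f + o.f, self.df + o.df, self.d2f + o.d2f)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Jet) else Jet(o)
        return Jet(self.f * o.f,
                   self.df * o.f + self.f * o.df,
                   self.d2f * o.f + 2 * self.df * o.df + self.f * o.d2f)
    __rmul__ = __mul__

def jtanh(j):
    # (tanh u)'  = sech^2(u) u'
    # (tanh u)'' = sech^2(u) u'' - 2 tanh(u) sech^2(u) (u')^2
    t = math.tanh(j.f)
    s = 1 - t * t
    return Jet(t, s * j.df, s * j.d2f - 2 * t * s * j.df * j.df)

def u(x, W, b, v):
    """One-hidden-layer MLP u(x) = sum_j v_j * tanh(W_j x + b_j)."""
    return sum((v_j * jtanh(W_j * x + b_j) for W_j, b_j, v_j in zip(W, b, v)), Jet(0.0))

W, b, v = [0.5, -1.2], [0.1, 0.3], [2.0, 1.0]  # arbitrary illustrative weights
x = Jet(0.7, 1.0, 0.0)      # seed dx/dx = 1, d2x/dx2 = 0
out = u(x, W, b, v)
residual = out.d2f - math.sin(0.7)  # PINN residual u'' - f at x = 0.7, example f(x) = sin x
```

The seed `Jet(x, 1, 0)` plays the same role as requesting gradients with respect to the input; `out.d2f` is exactly $d^2u/dx^2$ at that point.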
1 vote · 0 answers · 19 views
How many vectors can be placed in $n$ dimensions given max cosine similarity? [duplicate]
In machine learning we usually use the concept of cosine similarity to compare things. Similar things should have embeddings with high cosine similarity and different things should have embeddings ...
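A quick empirical illustration of the phenomenon behind this question (dimensions and counts below are arbitrary): at most $n$ vectors in $\mathbb{R}^n$ can be exactly orthogonal, but far more random vectors are nearly orthogonal, with pairwise cosine similarities on the order of $1/\sqrt{n}$ in magnitude:

```python
import random, math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

random.seed(0)
n, k = 64, 150  # 150 vectors in only 64 dimensions
vecs = [[random.gauss(0, 1) for _ in range(n)] for _ in range(k)]

# Largest pairwise |cosine similarity|: stays well below 1 even though k > n.
max_sim = max(abs(cosine(vecs[i], vecs[j]))
              for i in range(k) for j in range(i + 1, k))
print(max_sim)
```

This is the quasi-orthogonality effect that makes high-dimensional embeddings work: capping the allowed cosine similarity below 1 lets the count grow far beyond $n$.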
0 votes · 0 answers · 47 views
Self-Organizing Maps: why do the input vectors $\mathbf{x}$ depend on the step $t$?
Based on the paper "Essentials of the Self-Organizing Map", I rephrase paragraph 4.1, "The original, stepwise recursive SOM algorithm":
In the mathematical framework $\{\mathbf{x}(t)\}$ represents ...
1 vote · 0 answers · 40 views
FGSM for logistic regression
In arXiv:1412.6572 (https://arxiv.org/pdf/1412.6572), a seminal article, it is stated that
$$\mathrm{sgn}(\nabla_{\mathbf{x}} L(\mathbf{x},y,\mathbf{w})) = -\mathrm{sgn}(\mathbf{w})$$
for the softplus ...
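The claimed identity can be checked directly: with label $y = 1$ and softplus loss $L = \log(1 + e^{-y\,\mathbf{w}^T\mathbf{x}})$, the gradient in $\mathbf{x}$ is $-y\,\sigma(-y\,\mathbf{w}^T\mathbf{x})\,\mathbf{w}$, and since the sigmoid factor is positive the sign pattern is exactly $-\mathrm{sgn}(\mathbf{w})$. A numerical sketch with arbitrary weights:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_x(x, y, w):
    """Gradient of the softplus loss L = log(1 + exp(-y * w.x)) with respect to x."""
    s = sigmoid(-y * sum(wi * xi for wi, xi in zip(w, x)))
    return [-y * s * wi for wi in w]

sgn = lambda t: (t > 0) - (t < 0)

w = [0.8, -1.5, 0.3]   # arbitrary weights
x = [1.0, 2.0, -0.5]
g = grad_x(x, y=1, w=w)

print([sgn(gi) for gi in g])   # [-1, 1, -1]
print([-sgn(wi) for wi in w])  # [-1, 1, -1]  -> signs agree, as the paper states
```

This is what makes FGSM's perturbation $\epsilon\,\mathrm{sgn}(\nabla_{\mathbf{x}} L)$ independent of the input for a linear model: only the weight signs matter.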
0 votes · 0 answers · 22 views
Sample complexity bounds of $L_S(h)$
Fix $\mathscr{H} \subset \mathscr{Y}^\mathscr{X}$ and a loss $\ell : \hat{\mathscr{Y}} \times \mathscr{Y} \to [0,1]$. Fix $S \in (\mathscr{X} \times \mathscr{Y})^{2m}$. Assume for now that $S$ is not random. Suppose we ...
1 vote · 1 answer · 48 views
Sorting functions by the number of conditions needed to describe a random dataset?
Given a finite dataset like 1, 2, 3, 4, you could find infinitely many functions that fit it; for simplicity I found 2:
1. Add 1 for the next data point, so the sequence continues as 5, 6, etc.
2. Cycle through 1, 2, 3, 4, ...
0 votes · 0 answers · 31 views
Strong convexity and Lipschitz-continuous gradients, how restrictive are these assumptions in practice?
I am reading a paper on stochastic gradient descent and several variants of it. In all of the convergence proofs the author assumes strong convexity and Lipschitz-continuous gradients for the ...
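A quick way to see where these assumptions do and do not hold (a sketch with made-up data): least squares $f(w) = \tfrac{1}{2}\|Xw - b\|^2$ has constant Hessian $H = X^T X$, so it is $\mu$-strongly convex with $L$-Lipschitz gradient for $\mu = \lambda_{\min}(H)$ and $L = \lambda_{\max}(H)$; strong convexity fails exactly when $X$ is rank-deficient (more features than samples, or collinear columns), which is common in practice:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))  # tall, full-rank design: both assumptions hold
H = X.T @ X                        # constant Hessian of 0.5*||Xw - b||^2

eigs = np.linalg.eigvalsh(H)       # ascending eigenvalues
mu, L = eigs[0], eigs[-1]          # strong-convexity and smoothness constants
print(mu > 0, L / mu)              # L/mu is the condition number driving SGD rates
```

For deep networks neither constant exists globally, which is why such proofs are usually read as statements about an idealized regime rather than the actual training problem.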
0 votes · 0 answers · 14 views
Information coefficient as loss function of XGBoost
$$\mathrm{IC} = \frac{\frac{1}{n}\hat{y}^T y - \mathrm{E}\left[ \hat{y} \right] \mathrm{E}\left[ y \right]}{\sigma \left[ \hat{y} \right] \sigma \left[ y \right]}$$
XGBoost requires a gradient and a Hessian of ...
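One route (a sketch from my own derivation, not an official XGBoost recipe): take the loss to be $-\mathrm{IC}$ and differentiate the Pearson correlation with respect to each prediction, which gives $\partial\,\mathrm{IC}/\partial\hat{y}_i = \frac{y_i - \bar{y}}{n\,\sigma[\hat{y}]\,\sigma[y]} - \mathrm{IC}\cdot\frac{\hat{y}_i - \bar{\hat{y}}}{n\,\sigma[\hat{y}]^2}$. The true Hessian couples all samples, so XGBoost's elementwise API cannot represent it exactly; a constant positive surrogate is a common practical stand-in:

```python
import math

def ic(yhat, y):
    """Pearson correlation (information coefficient) between predictions and targets."""
    n = len(y)
    mh, my = sum(yhat) / n, sum(y) / n
    cov = sum((a - mh) * (b - my) for a, b in zip(yhat, y)) / n
    sh = math.sqrt(sum((a - mh) ** 2 for a in yhat) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sh * sy)

def ic_grad(yhat, y):
    """Closed-form gradient d IC / d yhat_i (derived above)."""
    n = len(y)
    mh, my = sum(yhat) / n, sum(y) / n
    sh = math.sqrt(sum((a - mh) ** 2 for a in yhat) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    rho = ic(yhat, y)
    return [(b - my) / (n * sh * sy) - rho * (a - mh) / (n * sh * sh)
            for a, b in zip(yhat, y)]

yhat = [0.2, 1.1, -0.4, 0.9]  # illustrative predictions and targets
y = [0.0, 1.0, -1.0, 1.0]
g = ic_grad(yhat, y)
# For a custom objective one would return (-g_i, c) per sample, with c > 0
# the constant Hessian surrogate, since the loss being minimized is -IC.
```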
1 vote · 0 answers · 71 views
Relation between values of $\xi_i$ and $\alpha_i$ in SVM?
I have a question about a property of the support vectors of SVMs which is stated in subsection "12.2.1 Computing the Support Vector Classifier" of "The Elements of Statistical Learning" ...