
Questions tagged [machine-learning]

How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?

0 votes
0 answers
32 views

How to estimate the inverse of a non-invertible matrix?

So I'm working on a machine learning problem where my solution requires taking the inverse of a matrix at some point. The problem is that this matrix is sometimes non-invertible. In theory the ...
asked by Dr.
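
A common workaround here, assuming a least-squares substitute for the true inverse is acceptable, is the Moore-Penrose pseudoinverse or a small ridge shift; a minimal sketch (the matrix `A` and the value of `lam` are illustrative, not from the question):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])  # rank-deficient: row 2 is twice row 1

# Moore-Penrose pseudoinverse: the minimum-norm least-squares
# substitute for A^{-1} when A is singular
A_pinv = np.linalg.pinv(A)

# Ridge-style alternative: shift the spectrum so inversion succeeds
lam = 1e-6  # regularization strength, a tunable assumption
A_ridge = np.linalg.inv(A + lam * np.eye(A.shape[0]))
```

Which of the two fits depends on whether the downstream solution tolerates the small bias the ridge term introduces.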
-1 votes
0 answers
28 views

Distribution of two combined ML models

Due to its complexity, the problem was divided into two models: a stationary model and a model that corrects the stationary model for temporal effects, i.e. $X = X_{stat} + X_{time}$ ...
asked by xbc68
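
If the two model outputs can be treated as random variables, the distribution of their sum is governed by the usual identity below; whether the covariance term vanishes depends on an independence assumption the question would need to justify:

$$\operatorname{Var}(X) = \operatorname{Var}(X_{stat}) + \operatorname{Var}(X_{time}) + 2\,\operatorname{Cov}(X_{stat}, X_{time}).$$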
-2 votes
1 answer
96 views

EM algorithm for estimating worker ability [closed]

I cannot understand how to derive the results in the M-step. The formulation is from https://papers.nips.cc/paper_files/paper/2012/file/cd00692c3bfe59267d5ecfac5310286c-Paper.pdf
asked by Hwang Jeong Yeon
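
For reference, the generic shape of the two EM steps (not the paper's specific worker-ability update) is

$$Q(\theta \mid \theta^{(t)}) = \mathbb{E}_{Z \mid X, \theta^{(t)}}\big[\log p(X, Z \mid \theta)\big], \qquad \theta^{(t+1)} = \arg\max_{\theta}\, Q(\theta \mid \theta^{(t)}),$$

so the M-step results come from setting the derivative of $Q$ with respect to each parameter to zero.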
0 votes
0 answers
28 views

Formulating a solution ansatz for the 1D heat equation in polar coordinates to learn the PDE in a PINN setting

Hello Math Stack Exchange Community, I am working on solving a partial differential equation (PDE) with a neural network in a PINN-like fashion, and I am seeking advice on identifying an appropriate ...
asked by alighato
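
Assuming the radially symmetric (polar) form of the heat equation is what is meant, the residual a PINN would penalize is

$$\mathcal{R}(r,t) = \frac{\partial u}{\partial t} - \alpha\left(\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\,\frac{\partial u}{\partial r}\right),$$

and any ansatz has to keep the $1/r$ term finite as $r \to 0$, e.g. by building $\partial u/\partial r\,(0,t) = 0$ into the architecture.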
6 votes
1 answer
68 views

What is the collection of functions that a given finite neural network can approximate with ease?

To my understanding, one version of the universal approximation theorem runs as follows: Let $\Phi$ be the family of (trained) feedforward neural networks of bounded width, arbitrary depth, and mild ...
asked by SapereAude
0 votes
1 answer
23 views

Question about the likelihood function of discriminative models

I'm a little confused about the likelihood function. For discriminative models, we have a hypothesis function $h_{\theta}(x) = p(y \mid x ; \theta)$. Using the principle of maximum likelihood, we want ...
asked by Joe Jameson
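
For a concrete instance, with binary labels $y_i \in \{0,1\}$ and $h_\theta(x) = p(y = 1 \mid x; \theta)$, the conditional log-likelihood being maximized is

$$\ell(\theta) = \sum_{i=1}^{n} \log p(y_i \mid x_i; \theta) = \sum_{i=1}^{n} \Big[\, y_i \log h_\theta(x_i) + (1 - y_i)\log\big(1 - h_\theta(x_i)\big) \Big];$$

no model of $p(x)$ appears, which is precisely what makes the model discriminative.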
0 votes
0 answers
25 views

Calculating functional derivative for a Physics-Informed Neural Network (PINN) using Automatic Differentiation

I'm working with a Physics-Informed Neural Network (PINN) to approximate the solution of a 1D Poisson equation: $\frac{d^2u}{dx^2} = f$. Here, I have an MLP with weight parameters $\theta$ that takes a ...
asked by Yanyan Wang
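
A minimal sketch of the usual autodiff pattern for this residual, assuming PyTorch and an illustrative two-layer MLP (all names below are placeholders, not the asker's code):

```python
import torch

# hypothetical network u_theta: R -> R
u = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

x = torch.linspace(0.0, 1.0, 64).reshape(-1, 1).requires_grad_(True)
out = u(x)

# du/dx via autograd; create_graph=True keeps it differentiable
du = torch.autograd.grad(out, x, grad_outputs=torch.ones_like(out),
                         create_graph=True)[0]
# d2u/dx2 by differentiating again
d2u = torch.autograd.grad(du, x, grad_outputs=torch.ones_like(du),
                          create_graph=True)[0]

f = torch.zeros_like(x)            # placeholder source term, an assumption
loss = ((d2u - f) ** 2).mean()     # squared Poisson residual d2u/dx2 - f
```

`create_graph=True` is what allows the loss to be backpropagated through $d^2u/dx^2$ into $\theta$.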
1 vote
0 answers
19 views

How many vectors can be placed in $n$ dimensions given max cosine similarity? [duplicate]

In machine learning we usually use the concept of cosine similarity to compare things. Similar things should have embeddings with high cosine similarity and different things should have embeddings ...
asked by F. Bruno Dias
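
As a quick empirical companion (all numbers arbitrary), random unit vectors in high dimension are nearly orthogonal, which is why the attainable count depends so sharply on the similarity cap:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 128, 1000                    # dimension and vector count, illustrative
V = rng.standard_normal((k, n))
V /= np.linalg.norm(V, axis=1, keepdims=True)   # project onto the unit sphere

G = V @ V.T                # pairwise cosine similarities
np.fill_diagonal(G, -1.0)  # mask self-similarity
print(G.max())             # largest pairwise similarity observed
```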
0 votes
0 answers
47 views

Self-Organizing Maps: why are the input vectors $\mathbf{x}$ dependent on the step $t$?

Based on the paper "Essentials of the Self-Organizing Map", I rephrase paragraph 4.1 (The original, stepwise recursive SOM algorithm): in the mathematical framework $\{\mathbf{x}(t)\}$ represents ...
asked by Nauel
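
For context, in the stepwise recursive formulation $\mathbf{x}(t)$ is simply the input sample presented at iteration $t$, and the update itself depends on $t$ through the decaying learning rate and neighborhood. A minimal sketch with illustrative exponential schedules (not the paper's exact ones):

```python
import numpy as np

def som_step(W, x, t, alpha0=0.5, sigma0=2.0, tau=1000.0):
    """One recursive SOM update on a 1D grid; W has shape (units, dim)."""
    alpha = alpha0 * np.exp(-t / tau)   # learning rate decays with step t
    sigma = sigma0 * np.exp(-t / tau)   # neighborhood width decays with t
    c = np.argmin(np.linalg.norm(W - x, axis=1))        # best-matching unit
    grid = np.arange(W.shape[0])
    h = np.exp(-((grid - c) ** 2) / (2 * sigma ** 2))   # Gaussian neighborhood
    return W + alpha * h[:, None] * (x - W)
```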
1 vote
0 answers
40 views

FGSM for logistic regression

In arXiv:1412.6572 (https://arxiv.org/pdf/1412.6572, a seminal article), it is stated that $$\mathrm{sgn}(\nabla_{\mathbf{x}} L(\mathbf{x},y,\mathbf{w})) = -\mathrm{sgn}(\mathbf{w})$$ for the softplus ...
asked by SEJ
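
The identity follows in one line once the gradient of the softplus loss is written out. Assuming $L(\mathbf{x}, y, \mathbf{w}) = \zeta\big(-y(\mathbf{w}^\top \mathbf{x} + b)\big)$ with $\zeta(z) = \log(1 + e^z)$ and $y \in \{-1, 1\}$, the chain rule gives

$$\nabla_{\mathbf{x}} L = -y\,\sigma\big(-y(\mathbf{w}^\top \mathbf{x} + b)\big)\,\mathbf{w},$$

and since the sigmoid $\sigma$ is strictly positive, $\mathrm{sgn}(\nabla_{\mathbf{x}} L) = -y\,\mathrm{sgn}(\mathbf{w})$, which reduces to $-\mathrm{sgn}(\mathbf{w})$ when $y = 1$.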
0 votes
0 answers
22 views

Sample complexity bounds of $L_S(h)$

Fix $\mathscr{H} \subset \mathscr{Y}^\mathscr{X}$ and a loss $\ell : \hat{\mathscr{Y}} \times \mathscr{Y} \to [0,1]$. Fix $S \in (\mathscr{X} \times \mathscr{Y})^{2m}$. Assume for now that $S$ is not random. Suppose we ...
asked by isaac
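
For a single fixed $h$, an i.i.d. sample of size $m$, and a loss bounded in $[0,1]$, the standard first step is Hoeffding's inequality, followed by a union bound when $\mathscr{H}$ is finite:

$$\Pr\big[\,|L_S(h) - L_{\mathscr{D}}(h)| \ge \epsilon\,\big] \le 2e^{-2m\epsilon^2}, \qquad \Pr\big[\,\exists h \in \mathscr{H}:\ |L_S(h) - L_{\mathscr{D}}(h)| \ge \epsilon\,\big] \le 2\,|\mathscr{H}|\,e^{-2m\epsilon^2}$$

(with $m$ replaced by the actual sample size, here $2m$).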
1 vote
1 answer
48 views

Sorting functions by the number of conditions needed for a random dataset to be described by them?

Given a finite dataset like 1, 2, 3, 4, you could find infinitely many functions that describe it; for simplicity I found two: 1. Add 1 for the next data point, so the sequence continues as 5, 6, etc. 2. Cycle through 1, 2, 3, 4, ...
asked by Anonymous
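
The usual formalization of this ranking is the minimum description length principle: among functions consistent with the data $D$, prefer the hypothesis $h$ minimizing

$$L(h) + L(D \mid h),$$

the code length of the hypothesis plus the code length of the data given the hypothesis; the question's "number of conditions" plays the role of $L(h)$.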
0 votes
0 answers
31 views

Strong convexity and Lipschitz-continuous gradients, how restrictive are these assumptions in practice?

I am reading a paper on stochastic gradient descent and several of its variants. For all the convergence proofs, the author assumes strong convexity and Lipschitz-continuous gradients for the ...
asked by Sen90
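
For reference, the two assumptions, for constants $0 < \mu \le L$:

$$f(y) \ge f(x) + \langle \nabla f(x),\, y - x \rangle + \tfrac{\mu}{2}\,\|y - x\|^2, \qquad \|\nabla f(x) - \nabla f(y)\| \le L\,\|x - y\|.$$

$\ell_2$-regularized least squares and logistic regression satisfy both, whereas deep networks are nonconvex, which is one concrete sense in which strong convexity is restrictive in practice.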
0 votes
0 answers
14 views

Information coefficient as loss function of XGBoost

$$ IC = \frac{\frac{1}{n}\hat{y}^Ty-\mathrm{E}\left[ \hat{y} \right] \mathrm{E}\left[ y \right]}{\sigma \left[ \hat{y} \right] \sigma \left[ y \right]} $$ XGBoost requires a gradient and a Hessian of ...
asked by atlantic0cean
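
A sketch of how a correlation-style objective plugs into XGBoost's custom-objective API, assuming the negated IC is to be minimized. The gradient below differentiates the Pearson correlation computed with population statistics; the constant positive Hessian is a simplifying assumption, not the exact second derivative:

```python
import numpy as np
import xgboost as xgb

def neg_ic_objective(predt, dtrain):
    y = dtrain.get_label()
    n = len(y)
    pc, yc = predt - predt.mean(), y - y.mean()
    sp, sy = pc.std() + 1e-12, yc.std() + 1e-12
    ic = (pc @ yc) / (n * sp * sy)
    # d(-IC)/d(predt): differentiate cov(predt, y) / (sigma_p * sigma_y)
    grad = -(yc / (n * sp * sy) - ic * pc / (n * sp ** 2))
    hess = np.full_like(predt, 1.0 / n)  # placeholder surrogate, an assumption
    return grad, hess

# usage sketch: booster = xgb.train(params, dtrain, obj=neg_ic_objective)
```

Since IC is a batch-level statistic rather than a sum of per-sample losses, every gradient entry depends on the whole prediction vector, a departure from XGBoost's usual pointwise-loss setting worth keeping in mind.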
1 vote
0 answers
71 views

Relation between the values of $\xi_i$ and $\alpha_i$ in SVMs?

I have a question about a property of the support vectors of an SVM which is stated in subsection "12.2.1 Computing the Support Vector Classifier" of "The Elements of Statistical Learning" ...
asked by hasanghaforian
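
For reference, the property in that subsection follows from the KKT conditions of the soft-margin problem; in ESL's notation, with upper bound $C$,

$$\alpha_i\big[y_i f(x_i) - (1 - \xi_i)\big] = 0, \qquad (C - \alpha_i)\,\xi_i = 0,$$

so $\alpha_i < C$ forces $\xi_i = 0$, $\xi_i > 0$ forces $\alpha_i = C$, and points with $0 < \alpha_i < C$ lie exactly on the margin ($y_i f(x_i) = 1$).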
