Title: | Handle Missing Tensor Data with C++ Integration |
---|---|
Description: | Tools for handling higher-order tensor data. See Kolda and Bader (2009) <doi:10.1137/07070111X> for details on tensors. While existing tensor packages extend the base 'array' class to their own data classes, this package is an alternative that handles tensors purely as objects of the 'array' class. Some functionality related to missingness is also supported. |
Authors: | Zetai Cen [aut, cre] |
Maintainer: | Zetai Cen <[email protected]> |
License: | GPL-3 |
Version: | 1.1.1 |
Built: | 2024-11-06 04:58:35 UTC |
Source: | https://github.com/cran/tensorMiss |
Computing the column space distance between two matrices
fle(A1, A2)
A1 |
A matrix of m rows and n columns. |
A2 |
A matrix of m rows and l columns where l can equal n. |
A numeric value
fle(matrix(1:12, nrow=4), matrix(11:22, nrow=4));
Estimate the factor structure of an order-K tensor observed at each time t, where K is at most 3 and missing entries are allowed
miss_factor_est(dt, r = 0, delta = 0.2)
dt |
Tensor time series, written in an array with dimension K+1 and mode-1 as the time mode. |
r |
Rank of the core tensors, written in a vector of length K. Setting the first value to 0 denotes an unknown rank, which is then estimated automatically using ratio-based estimators. Default is 0. |
delta |
Non-negative number as the correction parameter for rank estimation. Default is 0.2. |
A list containing the following: r: a vector of length K holding either the given rank or the estimated rank; A: a list of the K estimated factor loading matrices; Ft: the estimated core factor series, as a multi-dimensional array with dimension K+1, where mode-1 is the time mode; imputation: the imputed common component time series, as a multi-dimensional array with dimension K+1, where mode-1 is the time mode; covMatrix: a list of the estimated covariance matrices used to estimate the loading matrices.
K = 3; TT = 10; d = c(20,20,20); r = c(2,2,2); re = c(2,2,2); eta = list(c(0,0), c(0,0), c(0,0)); coef_f = c(0.7, 0.3, -0.4, 0.2, -0.1); coef_fe = c(-0.7, -0.3, -0.4, 0.2, 0.1); coef_e = c(0.8, 0.4, -0.4, 0.2, -0.1); data_test = tensor_gen(K,TT,d,r,re,eta, coef_f, coef_fe, coef_e); data_miss = miss_gen(data_test$X); miss_factor_est(data_miss, r);
Assign missingness to a given order-K tensor time series, where the maximum K is 4
miss_gen(dt, type = "random", p = 0.7)
dt |
Tensor time series, written in an array with dimension K+1 and mode-1 as the time mode. |
type |
Type of missingness, where "random" is random missing with probability p, "simul" is missingness on the last half along all dimensions, "mix" is a mixture of "random" and "simul". Default is "random". |
p |
If type is "random", then each entry is randomly missing with probability 1-p. Default is 0.7. |
A multi-dimensional array with dimension K+1, where mode-1 is the time mode and missing entries are denoted by NA
K = 3; TT = 10; d = c(20,20,20); r = c(2,2,2); re = c(2,2,2); eta = list(c(0,0), c(0,0), c(0,0)); coef_f = c(0.7, 0.3, -0.4, 0.2, -0.1); coef_fe = c(-0.7, -0.3, -0.4, 0.2, 0.1); coef_e = c(0.8, 0.4, -0.4, 0.2, -0.1); tensor_gen(K,TT,d,r,re,eta, coef_f, coef_fe, coef_e); data_test = tensor_gen(K,TT,d,r,re,eta, coef_f, coef_fe, coef_e); miss_gen(data_test$X);
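The random type of missingness described above can be illustrated in base R. This is a minimal sketch and not the package's implementation: the helper name random_miss_sketch and its arguments p_obs and seed are hypothetical, and it assumes, following the description of p, that each entry is kept with probability p_obs and set to NA otherwise.

```r
# Hypothetical helper, not the package's miss_gen(): every entry of the
# array is kept with probability p_obs and replaced by NA otherwise.
random_miss_sketch <- function(dt, p_obs = 0.7, seed = 1) {
  set.seed(seed)
  # One Bernoulli draw per entry; TRUE means the entry stays observed.
  observed <- rbinom(length(dt), size = 1, prob = p_obs) == 1
  dt[!observed] <- NA  # indexing by a logical vector keeps the array dim
  dt
}

# A toy series with mode-1 as the time mode (10 time points of 20 x 20).
dt <- array(rnorm(10 * 20 * 20), dim = c(10, 20, 20))
dt_miss <- random_miss_sketch(dt)
mean(is.na(dt_miss))  # proportion of missing entries, close to 1 - p_obs
```

The mask is drawn independently for each entry, so the time mode receives no special treatment in this sketch.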
Computing the q-quantile relative squared error, a generalisation of the relative mean squared error
qMSE(x_true, x_est, q = 100)
x_true |
True values, written in a vector of length n. |
x_est |
Imputed or estimated values, written in a vector of length n. |
q |
Number of partition intervals. If q equals n, then output is essentially relative mean squared error. Default is 100. |
A numeric value
qMSE(c(2, 3, 7, 1), c(-2, 0.5, 8, 2), 1);
Performing tensorisation of a matrix, which is the inverse process of unfolding
refold(unfolding, k, dim_vec)
unfolding |
A matrix. |
k |
An integer specifying the mode along which the array was unfolded. |
dim_vec |
A vector specifying the expected dimension of output array. |
A multi-dimensional array
refold(matrix(1:9,nrow=3), 1, c(3,1,3));
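Refolding can be sketched in base R as the inverse of a mode-k unfolding. The helper refold_sketch below is hypothetical, not the package's code, and it assumes the Kolda and Bader column ordering, in which the remaining modes vary in increasing order with the earliest mode fastest. The package's ordering may differ.

```r
# Hypothetical helper, not the package's refold(): rebuild an array of
# dimension dim_vec from its mode-k unfolding.
refold_sketch <- function(unfolding, k, dim_vec) {
  # Mode k first, then the remaining modes in increasing order.
  perm <- c(k, seq_along(dim_vec)[-k])
  # Reshape to the permuted dimensions, then undo the permutation.
  arr <- array(unfolding, dim = dim_vec[perm])
  aperm(arr, order(perm))
}

# Round trip: unfold a 3 x 4 x 2 array along mode 2 by hand, then refold.
X <- array(1:24, dim = c(3, 4, 2))
X2 <- aperm(X, c(2, 1, 3))
dim(X2) <- c(4, 6)
X_back <- refold_sketch(X2, 2, c(3, 4, 2))
all(X_back == X)
```

Here order(perm) gives the inverse permutation, which is what lets aperm restore the original mode order.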
Computing the HAC covariance estimator for asymptotic normality on each row j of the mode-k loading matrix estimator, with maximum order of tensor time series as 3
sigmaD(k, D, Q, C, Y, j, beta = 0)
k |
Mode of loading matrix. |
D |
Eigenvalue matrix of sample covariance matrix, with dimension rk by rk. |
Q |
Estimated mode-k loading matrix, with dimension Ik by rk. |
C |
Estimated common component series, written in an array with dimension K+1 and mode-1 as the time mode. |
Y |
Observed time series with missingness allowed, written in an array with dimension K+1 and mode-1 as the time mode. |
j |
Integer giving the row of the mode-k loading matrix. Must be between 1 and Ik inclusive. |
beta |
Lag parameter of the HAC type. Default is 0. |
A matrix of dimension rk by rk
K = 3; TT = 10; d = c(20,20,20); r = c(2,2,2); re = c(2,2,2); eta = list(c(0,0), c(0,0), c(0,0)); coef_f = c(0.7, 0.3, -0.4, 0.2, -0.1); coef_fe = c(-0.7, -0.3, -0.4, 0.2, 0.1); coef_e = c(0.8, 0.4, -0.4, 0.2, -0.1); data_test = tensor_gen(K,TT,d,r,re,eta, coef_f, coef_fe, coef_e); data_miss = miss_gen(data_test$X); data_est = miss_factor_est(data_miss, r); D = diag(x=(svd(data_est$covMatrix[[2]])$d)[1:2], nrow=2, ncol=2); sigmaD(2, D, data_est$A[[2]], data_est$imputation, data_miss, 2, 2);
Generate an order-K tensor at each time t, where the first mode is the time mode and K is at most 4
tensor_gen( K, TT, d, r, re, eta, coef_f, coef_fe, coef_e, heavy_tailed = FALSE, t_df = 3, seed = 2023 )
K |
Order of the generated tensor at each time t. |
TT |
Length of time series. |
d |
Dimensions of each mode of the tensor, written in a vector of length K. |
r |
Rank of the core tensors, written in a vector of length K. |
re |
Rank of the cross-sectional common error core tensors, written in a vector of length K. |
eta |
Quantities controlling factor strengths in each factor loading matrix, written in a list of K vectors. |
coef_f |
AR(5) coefficients for the factor series, written in a vector of length 5. |
coef_fe |
AR(5) coefficients for the common component in error series, written in a vector of length 5. |
coef_e |
AR(5) coefficients for the idiosyncratic component in error series, written in a vector of length 5. |
heavy_tailed |
Whether to generate data from heavy-tailed distribution. If FALSE, generate from N(0,1); if TRUE, generate from t-distribution. Default is FALSE. |
t_df |
The degrees of freedom of the t-distribution used when heavy_tailed = TRUE. Default is 3. |
seed |
Random seed required for reproducibility. Default is 2023. |
A list containing the following: X: the generated tensor time series, as multi-dimensional array with dimension K+1, where mode-1 is the time mode; A: a list of K factor loading matrices; C: the generated common component time series, as multi-dimensional array with dimension K+1, where mode-1 is the time mode; Ft: the generated core factor series, as multi-dimensional array with dimension K+1, where mode-1 is the time mode;
K = 3; TT = 10; d = c(20,20,20); r = c(2,2,2); re = c(2,2,2); eta = list(c(0,0), c(0,0), c(0,0)); coef_f = c(0.7, 0.3, -0.4, 0.2, -0.1); coef_fe = c(-0.7, -0.3, -0.4, 0.2, 0.1); coef_e = c(0.8, 0.4, -0.4, 0.2, -0.1); tensor_gen(K,TT,d,r,re,eta, coef_f, coef_fe, coef_e);
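The arguments coef_f, coef_fe and coef_e each hold five AR(5) coefficients. As a rough illustration of what one such series looks like, the base R sketch below simulates a single univariate AR(5) path. The helper ar5_sketch, its burn_in argument and the use of standard normal innovations are assumptions made for illustration; how tensor_gen builds its series internally is not stated here.

```r
# Hypothetical helper, not part of the package: simulate one AR(5) path
#   x_t = coef[1] * x_{t-1} + ... + coef[5] * x_{t-5} + e_t,  e_t ~ N(0, 1).
ar5_sketch <- function(TT, coef, burn_in = 100, seed = 2023) {
  set.seed(seed)
  innov <- rnorm(TT + burn_in)
  # A recursive filter applies the autoregression to the innovations,
  # starting from zero initial values.
  x <- stats::filter(innov, filter = coef, method = "recursive")
  # Drop the burn-in so the zero start has less influence on the output.
  as.numeric(x)[(burn_in + 1):(burn_in + TT)]
}

f <- ar5_sketch(TT = 10, coef = c(0.7, 0.3, -0.4, 0.2, -0.1))
length(f)
```

The burn-in is a common device for reducing dependence on the starting values and is purely a choice of this sketch.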
Performing the k-mode matrix product of a tensor with a matrix
ttm(ten, A, k)
ten |
A multi-dimensional array with the mode-k dimension m. |
A |
A matrix with dimension n by m. |
k |
An integer specifying the tensor mode along which to perform the k-mode matrix product. |
A multi-dimensional array with mode-k dimension n
ttm(array(1:24,c(3,4,2)), matrix(1:4,nrow =2), 3);
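The k-mode product can be written as "unfold along mode k, multiply on the left by A, refold". The base R sketch below follows that recipe. The helper ttm_sketch is hypothetical and is not the package's implementation, which may be organised differently (the package title mentions C++ integration).

```r
# Hypothetical helper, not the package's ttm(): k-mode product of an
# array `ten` (mode-k dimension m) with a matrix A (n by m).
ttm_sketch <- function(ten, A, k) {
  d <- dim(ten)
  perm <- c(k, seq_along(d)[-k])
  # Mode-k unfolding: bring mode k to the front and flatten the rest.
  unfolded <- aperm(ten, perm)
  dim(unfolded) <- c(d[k], prod(d[-k]))
  # Multiply every mode-k fibre by A.
  prod_mat <- A %*% unfolded
  # Refold: mode k now has dimension nrow(A).
  new_dim <- d
  new_dim[k] <- nrow(A)
  out <- array(prod_mat, dim = new_dim[perm])
  aperm(out, order(perm))
}

X <- array(1:24, c(3, 4, 2))
A <- matrix(1:4, nrow = 2)
Y <- ttm_sketch(X, A, 3)
Y[1, 1, 1]  # 1 * X[1,1,1] + 3 * X[1,1,2] = 1 + 39 = 40
Y[1, 1, 2]  # 2 * X[1,1,1] + 4 * X[1,1,2] = 2 + 52 = 54
```

Each entry of the result is a linear combination of one mode-k fibre of the input, with weights taken from the matching row of A.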
Performing tensor unfolding, also known as matricization, on a multi-dimensional array
unfold(ten, k)
ten |
A multi-dimensional array. |
k |
An integer specifying the mode of array to unfold. |
A matrix
unfold(array(1:24, dim=c(3,4,2)), 2);
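A mode-k unfolding can be sketched in base R with aperm. The helper unfold_sketch below is hypothetical, not the package's code, and it assumes the Kolda and Bader convention: rows index mode k, and columns run over the remaining modes in increasing order with the earliest mode varying fastest. The package's column ordering may differ.

```r
# Hypothetical helper, not the package's unfold(): mode-k unfolding
# of a multi-dimensional array into a matrix.
unfold_sketch <- function(ten, k) {
  d <- dim(ten)
  # Bring mode k to the front; keep the other modes in increasing order.
  perm <- c(k, seq_along(d)[-k])
  m <- aperm(ten, perm)
  # R stores arrays column-major, so flattening the trailing modes yields
  # columns with the earliest remaining mode varying fastest.
  dim(m) <- c(d[k], prod(d[-k]))
  m
}

X <- array(1:24, dim = c(3, 4, 2))
M <- unfold_sketch(X, 2)
dim(M)   # 4 rows (mode 2) and 3 * 2 = 6 columns
M[2, 1]  # equals X[1, 2, 1] = 4
M[1, 4]  # equals X[1, 1, 2] = 13
```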