# Datasets

**Misinformation Detection During High Impact Events: An Application to COVID-19**

Social media has become an important communication channel during high impact events, such as the COVID-19 pandemic. Because misinformation on social media can spread rapidly and create social unrest, curtailing its spread during such events is a significant data challenge. We present a labeled COVID-19 Twitter dataset, based on socio-linguistic criteria, that can be used to study misinformation during COVID-19 from both a machine learning and a computational linguistic perspective.

**COVID-19 Twitter Data:** Data

Z. Boukouvalas, C. Mallinson, E. Crothers, N. Japkowicz, A. Piplai, M. Sudip, A. Joshi, T. Adali, "Independent Component Analysis for Trustworthy Cyberspace during High Impact Events: An Application to Covid-19," submitted for publication in the proceedings of the IEEE 30th International Workshop on MLSP.

# Matlab Code

**Sparse ICA: Independence Vs Sparsity**

For a given dataset, blind source separation (BSS) provides useful decompositions under minimal assumptions, typically by making use of statistical properties of the data, referred to as forms of diversity. Two popular forms of diversity that have proven useful for many applications are *statistical independence* and *sparsity*. Although many methods have been proposed for the BSS problem that take either the statistical independence or the sparsity of the data into account, there is no unified method that takes both forms of diversity into account simultaneously. The proposed algorithm, SparseICA by entropy bound minimization (SparseICA-EBM), inherits the advantages of ICA by entropy bound minimization (ICA-EBM), namely its flexibility, while improving performance by exploiting the sparsity of the underlying sources (when they are indeed sparse). It also enables direct control over the degree to which independence and sparsity are emphasized.
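The idea of weighting independence against sparsity can be illustrated with a toy objective. The sketch below is not the SparseICA-EBM cost: it assumes a simple log-cosh likelihood term as a stand-in for the entropy bound, an L1 penalty as the sparsity term, and a hypothetical weight `lam` that plays the role of the trade-off control. The function name and signature are illustrative only.

```python
import numpy as np

def toy_sparse_ica_cost(W, X, lam):
    """Toy cost: independence term + lam * sparsity term.

    W   : (N, N) demixing matrix
    X   : (N, T) observed mixtures
    lam : non-negative weight on the sparsity penalty (assumed name)
    """
    Y = W @ X  # source estimates

    # Independence stand-in: negative log-likelihood under a fixed
    # 1/cosh source prior (NOT the entropy bound used by ICA-EBM).
    # log(cosh(y)) is computed stably as logaddexp(y, -y) - log(2).
    log_cosh = np.logaddexp(Y, -Y) - np.log(2.0)
    _, logabsdet = np.linalg.slogdet(W)
    independence = np.mean(np.sum(log_cosh, axis=0)) - logabsdet

    # Sparsity term: average L1 norm of the estimated sources.
    sparsity = np.mean(np.sum(np.abs(Y), axis=0))

    return independence + lam * sparsity
```

Setting `lam = 0` leaves only the independence term, and increasing `lam` puts more emphasis on sparse source estimates, which mirrors the kind of control described above.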

**Sparse ICA by entropy bound minimization:** SparseICA-EBM

**References**

[1] Z. Boukouvalas, Y. Levin-Schwartz, V. D. Calhoun, and T. Adali, "Sparsity and Independence: Balancing of two Objectives in Optimization for Source Separation with Application to fMRI Analysis," Elsevier, Journal of the Franklin Institute (JFI), Engineering and Applied Mathematics, 2017.

[2] Z. Boukouvalas, Y. Levin-Schwartz, and T. Adali, "Enhancing ICA Performance By Exploiting Sparsity: Application to fMRI Analysis," in *Proc. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)*, New Orleans, LA, March 2017, pp. 2532-2536.

**Multivariate Generalized Gaussian Distribution (MGGD) and Parameter Estimation**

The multivariate generalized Gaussian distribution (MGGD) has been an attractive solution to many signal processing problems due to its simple yet flexible parametric form, which requires the estimation of only two parameters: the scatter matrix and the shape parameter. We present code for generating realizations from the MGGD and for estimating its parameters [1]. If the shape parameter is less than 1, the marginals are super-Gaussian (i.e., more peaked, with heavier tails); if it is greater than 1, the marginals are sub-Gaussian (i.e., less peaked, with lighter tails). If the shape parameter is equal to 1, the generated sources are multivariate Gaussian.
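The effect of the shape parameter can be checked numerically. The following Python sketch is not the released Matlab code; it assumes the common stochastic representation x = τ L u, where u is uniform on the unit sphere, L is a Cholesky factor of the scatter matrix, and τ^(2β) follows a Gamma distribution with shape p/(2β) and scale 2. The names `sample_mggd` and `excess_kurtosis` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def sample_mggd(n_samples, scatter, beta, seed=None):
    """Draw zero-mean MGGD samples (sketch, assumed parameterization).

    scatter : (p, p) symmetric positive-definite scatter matrix
    beta    : shape parameter (beta = 1 gives the Gaussian case)
    Returns an (n_samples, p) array.
    """
    rng = np.random.default_rng(seed)
    scatter = np.asarray(scatter, dtype=float)
    p = scatter.shape[0]

    # Directions: uniform on the unit sphere in R^p.
    g = rng.standard_normal((n_samples, p))
    u = g / np.linalg.norm(g, axis=1, keepdims=True)

    # Radii: tau^(2*beta) ~ Gamma(shape=p/(2*beta), scale=2).
    gam = rng.gamma(shape=p / (2.0 * beta), scale=2.0, size=n_samples)
    tau = gam ** (1.0 / (2.0 * beta))

    # Colour the samples with a Cholesky factor of the scatter matrix.
    L = np.linalg.cholesky(scatter)
    return (tau[:, None] * u) @ L.T

def excess_kurtosis(x):
    """Sample excess kurtosis of a 1-D array (0 for a Gaussian)."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2) ** 2 - 3.0
```

Under this parameterization, the covariance equals the scatter matrix only when β = 1; for other values the covariance is proportional to it. Comparing `excess_kurtosis` of one marginal for β below and above 1 shows the super-Gaussian and sub-Gaussian behaviour described above.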

**MGGD generation and parameter estimation:** MGGD-Generation-Estimation

**References**

[1] Z. Boukouvalas, S. Said, L. Bombrun, Y. Berthoumieu, and T. Adali, "A new Riemannian averaged fixed-point algorithm for MGGD parameter estimation," IEEE Signal Proc. Letts., vol. 22, no. 12, pp. 2314-2318, Dec. 2015.

**Independent Vector Analysis with Adaptive MGGD (IVA-A-GGD)**

Due to its flexibility, the MGGD provides an effective model for IVA. Because the MGGD models the latent multivariate variables (the sources), the performance of the IVA algorithm depends strongly on the estimation of the source parameters. We present two IVA-A-GGD algorithms that estimate the shape parameter and scatter matrix jointly while taking both second-order statistics (SOS) and higher-order statistics (HOS) into account. The first (IVA-A-GGD-MLFS) is based on a Fisher scoring (FS) algorithm [1], and the second (IVA-A-GGD-RAFP) on a fixed-point (FP) algorithm [2].

**IVA-A-GGD algorithms:** IVA-A-GGD

**References**

[1] Z. Boukouvalas, G.-S. Fu, and T. Adali, "An efficient multivariate generalized Gaussian distribution estimator: Application to IVA," in *Proc. Conf. on Info. Sciences and Systems (CISS)*, Baltimore, MD, March 2015.

[2] Z. Boukouvalas, S. Said, L. Bombrun, Y. Berthoumieu, and T. Adali, "A new Riemannian averaged fixed-point algorithm for MGGD parameter estimation," IEEE Signal Proc. Letts., vol. 22, no. 12, pp. 2314-2318, Dec. 2015.

# Python Code

**Independent Vector Analysis in Python**

pyiva is a Python package that implements independent vector analysis (IVA) using a multivariate Laplace prior.
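To make the role of the multivariate Laplace prior concrete, the sketch below evaluates the commonly used IVA-Laplace objective (up to additive constants) with plain NumPy. It does not use or reproduce the pyiva API; the function name `iva_laplace_cost` and the array layout (datasets first, then components, then samples) are assumptions made for this illustration, and the prior is taken with identity scatter.

```python
import numpy as np

def iva_laplace_cost(W, X):
    """IVA cost under a multivariate Laplace source prior (sketch).

    W : (K, N, N) one demixing matrix per dataset
    X : (K, N, T) K datasets, N mixtures each, T samples
    """
    # Source estimates for every dataset: Y[k] = W[k] @ X[k].
    Y = np.einsum('knm,kmt->knt', W, X)

    # For each component n and sample t, the source component vector
    # stacks the n-th estimate across the K datasets; the Laplace
    # prior penalizes its Euclidean norm.
    scv_norms = np.sqrt(np.sum(Y**2, axis=0))       # shape (N, T)
    dependence = np.sum(np.mean(scv_norms, axis=1))

    # Log-determinant term keeps each W[k] away from singularity.
    logabsdets = np.linalg.slogdet(W)[1]            # shape (K,)

    return dependence - np.sum(logabsdets)
```

An IVA algorithm would minimize a cost of this kind over the demixing matrices; the norm across datasets is what couples corresponding sources and distinguishes IVA from running ICA separately on each dataset.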

**pyIVA**