Principal Component Analysis for Categorical Histogram Data: Some Open Directions of Research

Diday, Edwin (2011), Principal Component Analysis for Categorical Histogram Data: Some Open Directions of Research, in Fichet, Bernard; Piccolo, Domenico; Verde, Rosanna; Vichi, Maurizio, Classification and Multivariate Analysis for Complex Data Structures, Springer: Berlin, p. 3-15. http://dx.doi.org/10.1007/978-3-642-13312-1_1

Type
Book chapter
Date
2011
Book title
Classification and Multivariate Analysis for Complex Data Structures
Book authors
Fichet, Bernard; Piccolo, Domenico; Verde, Rosanna; Vichi, Maurizio
Publisher
Springer
Series title
Studies in Classification, Data Analysis, and Knowledge Organization
Place of publication
Berlin
ISBN
978-3-642-13311-4
Number of pages
473
Pages
3-15
Publication identifier
http://dx.doi.org/10.1007/978-3-642-13312-1_1
Author(s)
Diday, Edwin
Abstract (EN)
In recent years, the analysis of symbolic data, where the units are categories, classes or concepts described by intervals, distributions, sets of categories and the like, has become a challenging task, since many application fields generate massive amounts of data that are difficult to store and to analyze with traditional techniques [1]. In this paper we propose a strategy for extending standard PCA to such data in the case where the variable values are “categorical histograms” (i.e. a set of categories called bins with their relative frequencies). These variables are a special case of “modal” variables (see, for example, Diday and Noirhomme [5]) or of “compositional” variables (Aitchison [1]), where the weights are not necessarily frequencies. First, we introduce “metabins”, which mix together bins of the different histograms and enhance interpretability. Standard PCA applied to the bins of such a data table loses the histogram constraints and assumes independence between the bins, whereas copulas take care of the probabilities and the underlying dependencies. Then, we give several ways of representing the units (called “individuals”), the bins, the variables and the metabins when the number of categories is not the same for each variable. A way of representing the variation of the individuals so as to obtain histograms as output is also given. Finally, some theoretical results allow the representation of the categorical histogram variables inside a hypercube covering the correlation sphere.
Keywords
Multidimensional data; Data analysis
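
As a rough illustration of the baseline the abstract refers to (standard PCA applied directly to the bins), the sketch below builds a small table of categorical histograms, flattens it into one column per bin, and extracts a first principal axis by power iteration. The table, the variable names (`colour`, `size`) and all helper functions are hypothetical and chosen only for illustration; this is not the chapter's method, and it does not implement metabins or copulas. It only makes visible one histogram constraint: because the bins of a variable sum to 1, the centered bin columns of that variable sum to 0 for every individual, a linear dependency that plain PCA on bin columns is unaware of.

```python
import math

# Hypothetical toy table: each individual is described by two categorical
# histogram variables; each histogram maps bins to relative frequencies.
table = {
    "unit_a": {"colour": {"red": 0.5, "green": 0.3, "blue": 0.2},
               "size": {"small": 0.6, "large": 0.4}},
    "unit_b": {"colour": {"red": 0.1, "green": 0.6, "blue": 0.3},
               "size": {"small": 0.2, "large": 0.8}},
    "unit_c": {"colour": {"red": 0.4, "green": 0.4, "blue": 0.2},
               "size": {"small": 0.5, "large": 0.5}},
    "unit_d": {"colour": {"red": 0.2, "green": 0.2, "blue": 0.6},
               "size": {"small": 0.9, "large": 0.1}},
}

def flatten(table):
    """Return unit names, (variable, bin) column labels, and the frequency matrix."""
    units = sorted(table)
    first = table[units[0]]
    columns = [(var, b) for var in first for b in first[var]]
    rows = [[table[u][var][b] for var, b in columns] for u in units]
    return units, columns, rows

def center(rows):
    """Subtract the column mean from every entry."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Covariance matrix of the bin columns (divisor n)."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(p)]
            for i in range(p)]

def first_axis(cov, iterations=200):
    """Leading eigenvalue and unit eigenvector of cov by power iteration."""
    p = len(cov)
    # Start from a basis vector: the all-ones vector lies in the null space.
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(v, cv)), v

def block_indices(columns):
    """Map each variable to the positions of its bin columns."""
    blocks = {}
    for k, (var, _) in enumerate(columns):
        blocks.setdefault(var, []).append(k)
    return blocks

units, columns, rows = flatten(table)
centered = center(rows)
cov = covariance(centered)
eigenvalue, axis = first_axis(cov)
blocks = block_indices(columns)

# Largest absolute within-variable sum of centered values over all individuals.
centered_sums = {var: max(abs(sum(row[k] for k in idx)) for row in centered)
                 for var, idx in blocks.items()}
# Sum of the first-axis loadings over the bins of each variable.
loading_sums = {var: sum(axis[k] for k in idx) for var, idx in blocks.items()}

print("bin columns:", columns)
print("within-variable centered sums:", centered_sums)
print("within-variable loading sums:", loading_sums)
```

Both dictionaries printed at the end hold values that are zero up to rounding error, which is the dependency described above: within each variable the bin columns are tied together by construction, so they cannot be treated as free, independent columns.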

Related publications

Showing items related by title and author.

  • Principal component analysis for interval-valued observations
    Diday, Edwin; Douzal-Chouakria, Ahlame; Billard, Lynne (2011) Article accepted for publication or published
  • Application of symbolic data analysis for structural modification assessment
    Cury, Alexandre; Crémona, Christian; Diday, Edwin (2010) Article accepted for publication or published
  • Analyse en axes principaux de variables symboliques de type histogramme
    Diday, Edwin; Makosso Kallyth, Sun (2010) Conference paper
  • Data analysis and informatics. Proceedings of the Second international Symposium on Data Analysis and Informatics, organised by the Institut de Recherche d'Informatique et d'automatique, Versailles, October 17-19, 1979.
    Tomassone, R.; Pagès, J.P.; Lebart, Ludovic; Diday, Edwin (1979-10) Book
  • From the statistics of data to the statistics of knowledge: Symbolic data analysis.
    Billard, Lynne; Diday, Edwin (2003) Article accepted for publication or published