Diseño de un agregador para la gestión de los big data informativos

  1. Manuel Blázquez-Ochando 1
  1. 1 Universidad Complutense de Madrid, Madrid, Spain (ROR 02p0gd045)

Journal:
El profesional de la información

ISSN: 1386-6710 1699-2407

Year of publication: 2016

Issue title: Datos

Volume: 25

Issue: 4

Pages: 671-683

Type: Article

DOI: 10.3145/EPI.2016.JUL.17 (open access)


Abstract

This article describes the design and features of AXYZ, a new open source content aggregation program. Its notable features include an engine for processing syndication channels, real-time monitoring of information retrieval, configurable aggregator behavior, automatic content classification, and new models for representing information as interactive relational maps. The program is designed to manage thousands of syndication channels in RSS format. It also provides statistics for studying the output of any information producer and the impact of its published information on other sources. The AXYZ modules can compare how news or information items from different sources relate to one another, and they estimate the degree of influence between sources from detected patterns.
