2020 |
Tosi, Mauro Dalle Lucca
Constructing Knowledge Graphs from Textual Documents for Scientific Literature Analysis (Master's thesis)
Master's thesis,
2020.
(
Abstract |
Links |
BibTeX |
Tags:
Graph (Computer system),
Complex networks,
Centrality (Graph Theory),
Semantic computing,
Scientific knowledge
)
@mastersthesis{tosi2020,
abstract = {The number of publications a researcher must absorb has been increasing in recent years. Consequently, among so many options, it is hard for researchers to identify interesting documents to read related to their studies. Researchers usually search for review articles to understand how a scientific field is organized and to study its state of the art. Depending on the area, this option can be unavailable or outdated, and researchers have to carry out such a laborious background-research task manually. Recent studies have developed mechanisms to assist researchers in understanding the structure of scientific fields. However, those mechanisms focus on recommending relevant articles or on supporting researchers in understanding how a scientific field is organized based on the documents that belong to it. These methods limit field understanding, not allowing researchers to study the underlying concepts and relations that compose a scientific field and its sub-areas. This M.Sc. thesis proposes a framework to structure, analyze, and track the evolution of a scientific field at the concept level. Given a set of textual documents such as research papers, it first structures a scientific field as a knowledge graph using the detected concepts as vertices. Then, it automatically identifies the field's main sub-areas, extracts their keyphrases, and studies their relations. Our framework enables representing the scientific field in distinct time periods, allowing researchers to compare these representations and identify how the field's areas changed over time. We evaluate each step of our framework by representing and analyzing scientific data from distinct fields of knowledge in case studies. Our findings indicate success in detecting the sub-areas based on the graph generated from natural-language documents. We observe similar outcomes in the different case studies, indicating that our approach is applicable to distinct domains.
This research also contributes a web-based software tool that allows researchers to use the proposed framework graphically. By using our application, researchers can obtain an overview of how a scientific field is structured and how it evolved.},
author = {Tosi, Mauro Dalle Lucca},
title = {Constructing Knowledge Graphs from Textual Documents for Scientific Literature Analysis},
school = {University of Campinas - Institute of Computing},
keywords = {Graph;Complex networks;Centrality;Semantic computing;Scientific knowledge},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2020/DissertacaoMauro.pdf},
year = {2020},
date = {2020-03-09}
}
The number of publications a researcher must absorb has been increasing in recent years. Consequently, among so many options, it is hard for researchers to identify interesting documents to read related to their studies. Researchers usually search for review articles to understand how a scientific field is organized and to study its state of the art. Depending on the area, this option can be unavailable or outdated, and researchers have to carry out such a laborious background-research task manually. Recent studies have developed mechanisms to assist researchers in understanding the structure of scientific fields. However, those mechanisms focus on recommending relevant articles or on supporting researchers in understanding how a scientific field is organized based on the documents that belong to it. These methods limit field understanding, not allowing researchers to study the underlying concepts and relations that compose a scientific field and its sub-areas. This M.Sc. thesis proposes a framework to structure, analyze, and track the evolution of a scientific field at the concept level. Given a set of textual documents such as research papers, it first structures a scientific field as a knowledge graph using the detected concepts as vertices. Then, it automatically identifies the field's main sub-areas, extracts their keyphrases, and studies their relations. Our framework enables representing the scientific field in distinct time periods, allowing researchers to compare these representations and identify how the field's areas changed over time. We evaluate each step of our framework by representing and analyzing scientific data from distinct fields of knowledge in case studies. Our findings indicate success in detecting the sub-areas based on the graph generated from natural-language documents. We observe similar outcomes in the different case studies, indicating that our approach is applicable to distinct domains.
This research also contributes a web-based software tool that allows researchers to use the proposed framework graphically. By using our application, researchers can obtain an overview of how a scientific field is structured and how it evolved.
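The concept-graph pipeline summarized above (concepts as vertices, co-occurrence edges, sub-area detection) can be sketched in a few lines. The toy documents and the use of connected components as a stand-in for the thesis's sub-area detection are illustrative assumptions, not the actual implementation:

```python
from collections import defaultdict
from itertools import combinations

def build_concept_graph(docs):
    """Weighted co-occurrence graph: concepts appearing in the same
    document are linked; edge weight = number of shared documents."""
    weights = defaultdict(int)
    for concepts in docs:
        for a, b in combinations(sorted(set(concepts)), 2):
            weights[(a, b)] += 1
    return weights

def connected_components(weights):
    """Crude stand-in for sub-area detection: group concepts that are
    transitively connected in the co-occurrence graph."""
    adj = defaultdict(set)
    for a, b in weights:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for v in adj:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            comp.add(u)
            stack.extend(adj[u] - seen)
        comps.append(comp)
    return comps

# Hypothetical mini-corpus: each document is its list of detected concepts.
docs = [["ontology", "alignment", "mapping"],
        ["ontology", "evolution"],
        ["keyphrase", "graph"]]
g = build_concept_graph(docs)
areas = connected_components(g)
```

Running the framework on two time periods would simply mean building two such graphs and comparing their components.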
|
2019 |
Bonacin, Rodrigo;
Dos Reis, Julio Cesar;
Baranauskas, Maria Cecília Calani
Universal Participatory Design: Achievements and Challenges (journal)
Journal on Interactive Systems,
SBC,
journal,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Universal Access,
Participatory Design,
Accessibility,
Democracy in Design
)
@article{bonacin2019u,
author = {Rodrigo Bonacin and Julio Cesar Dos Reis and Maria Cecília Baranauskas},
title = {Universal Participatory Design: Achievements and Challenges},
journal = {Journal on Interactive Systems},
volume = {10},
number = {1},
year = {2019},
keywords = {Universal Access, Participatory Design, Accessibility, Democracy in Design},
abstract = {According to the principles of participatory design, a genuine democratic process requires effective participation of all affected people in the design process; this must include affected disabled users. However, user participation entails complex problems, which are aggravated by conditions of illiteracy and/or aging. This article presents the concept of Universal Participatory Design, a design philosophy and practice that aims to be inclusive during the design process, and which has a positive result for all. We first conducted a review of the literature to understand the limits of the relationships between participatory design and universal design. This paper then addresses some of the challenges to achieve Universal Participatory Design (UPD) by juxtaposing deficits observed in the literature with issues we experienced during two research projects. We discuss the key components of Participatory Design and its relationship to UPD, and establish a research agenda that aims to conceptualize and investigate participatory design with universal access. Our findings indicate the need for flexible design methods, adaptable artifacts, and positive designers’ attitudes when encountering unexpected situations.},
issn = {2236-3297},
url = {https://sol.sbc.org.br/journals/index.php/jis/article/view/714}
}
According to the principles of participatory design, a genuine democratic process requires effective participation of all affected people in the design process; this must include affected disabled users. However, user participation entails complex problems, which are aggravated by conditions of illiteracy and/or aging. This article presents the concept of Universal Participatory Design, a design philosophy and practice that aims to be inclusive during the design process, and which has a positive result for all. We first conducted a review of the literature to understand the limits of the relationships between participatory design and universal design. This paper then addresses some of the challenges to achieve Universal Participatory Design (UPD) by juxtaposing deficits observed in the literature with issues we experienced during two research projects. We discuss the key components of Participatory Design and its relationship to UPD, and establish a research agenda that aims to conceptualize and investigate participatory design with universal access. Our findings indicate the need for flexible design methods, adaptable artifacts, and positive designers’ attitudes when encountering unexpected situations.
|
Tosi, Mauro Dalle Lucca;
Dos Reis, Julio Cesar
C-Rank: A Concept Linking Approach to Unsupervised Keyphrase Extraction (conference)
Research Conference on Metadata and Semantics Research,
Springer,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Keyphrase extraction,
Complex networks,
Semantic annotation
)
@inproceedings{tosi2019c,
title={C-Rank: A Concept Linking Approach to Unsupervised Keyphrase Extraction},
author={Tosi, Mauro Dalle Lucca and dos Reis, Julio Cesar},
booktitle={Research Conference on Metadata and Semantics Research},
pages={236--247},
year={2019},
organization={Springer}
}
Keyphrase extraction is the task of identifying a set of phrases that best represent a natural language document. It is a fundamental and challenging task that assists publishers in indexing and recommending relevant documents to readers. In this article, we introduce C-Rank, a novel unsupervised approach to automatically extract keyphrases from single documents using concept linking. Our method explores Babelfy to identify candidate keyphrases, which are weighted based on heuristics and on their centrality inside a co-occurrence graph where keyphrases appear as vertices. It improves the results obtained by graph-based techniques without training or background data inserted by users. Evaluations are performed on the SemEval and INSPEC datasets, producing results competitive with state-of-the-art tools. Furthermore, C-Rank generates intermediate structures with semantically annotated data that can be used to analyze larger textual compendiums, which might improve domain understanding and enrich textual representation methods.
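The centrality weighting C-Rank applies over its co-occurrence graph can be illustrated with a minimal degree-centrality sketch; the candidate pairs are toy data, and the real system combines centrality with Babelfy-based heuristics:

```python
from collections import defaultdict

def degree_centrality(cooccurrences):
    """Score each candidate keyphrase by its normalized degree in the
    co-occurrence graph (edges given as candidate pairs)."""
    adj = defaultdict(set)
    for a, b in cooccurrences:
        adj[a].add(b)
        adj[b].add(a)
    n = len(adj)
    return {v: len(nb) / (n - 1) for v, nb in adj.items()}

def top_keyphrases(cooccurrences, k=3):
    """Return the k candidates with highest centrality."""
    scores = degree_centrality(cooccurrences)
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical candidate co-occurrences extracted from one document.
pairs = [("keyphrase extraction", "graph"),
         ("graph", "centrality"),
         ("graph", "semantic annotation")]
```

Here `top_keyphrases(pairs, k=1)` selects "graph", the most central vertex of the toy graph.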
|
Destro, Juliana Medeiros;
dos Reis, Julio Cesar;
Torres, Ricardo da S;
Ricarte, Ivan
Evolution-based refinement of cross-language ontology alignments (symposium)
Anais Principais do XXXIV Simpósio Brasileiro de Banco de Dados,
SBC,
symposium,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Ontology Alignment,
information interconnectivity,
ontology evolution,
refinement actions,
semantic relations
)
@inproceedings{destro2019e,
author = {Juliana Destro and Julio César dos Reis and Ricardo Torres and Ivan Ricarte},
title = {Evolution-based Refinement of Cross-language Ontology Alignments},
booktitle = {Anais Principais do XXXIV Simpósio Brasileiro de Banco de Dados},
location = {Fortaleza},
year = {2019},
keywords = {},
issn = {0000-0000},
pages = {61--72},
publisher = {SBC},
address = {Porto Alegre, RS, Brasil},
doi = {10.5753/sbbd.2019.8808},
url = {https://sol.sbc.org.br/index.php/sbbd/article/view/8808}
}
Ontology alignment plays a key role in information interconnectivity between computational systems relying on ontologies described in different natural languages. Existing approaches for ontology matching usually provide only the equivalence type of relation in the generated mappings. In this article, we propose a refinement technique that enables updating the semantic type of a mapping to types such as “is-a”, “part-of”, etc. Our approach relies on information from the ontology evolution to apply refinement actions. We formalize the refinement actions and procedures, and apply the proposal in application scenarios.
|
Regino, André Gomes;
Matsoui, Julio Kiyoshi Rodrigues;
Dos Reis, Julio Cesar;
Bonacin, Rodrigo;
Morshed, Ahsan;
Sellis, Timos
Understanding Link Changes in LOD via the Evolution of Life Science Datasets (conference)
Proceedings of the Workshop on Semantic Web Solutions for Large-Scale Biomedical Data Analytics co-located with 18th International Semantic Web Conference {(ISWC} 2019), Auckland, New Zealand, October 27th, 2019,
CEUR-WS.org,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
LOD,
Web of Data evolution,
Link evolution,
Change Operations,
Link changes,
Link Repair,
RDF life science datasets
)
@inproceedings{regino2019,
author = {Andr{\'{e}} Gomes Regino and
Julio Kiyoshi Rodrigues Matsoui and
J{\'{u}}lio C{\'{e}}sar dos Reis and
Rodrigo Bonacin and
Ahsan Morshed and
Timos Sellis},
editor = {Ali Hasnain and
V{\'{\i}}t Nov{\'{a}}cek and
Michel Dumontier and
Dietrich Rebholz{-}Schuhmann},
title = {Understanding Link Changes in {LOD} via the Evolution of Life Science
Datasets},
booktitle = {Proceedings of the Workshop on Semantic Web Solutions for Large-Scale
Biomedical Data Analytics co-located with 18th International Semantic
Web Conference {(ISWC} 2019), Auckland, New Zealand, October 27th,
2019},
series = {{CEUR} Workshop Proceedings},
volume = {2477},
pages = {40--54},
publisher = {CEUR-WS.org},
year = {2019},
urlPaper = {http://ceur-ws.org/Vol-2477/paper\_4.pdf},
urlWeb = {http://ceur-ws.org/Vol-2477/},
biburl = {https://dblp.org/rec/conf/semweb/ReginoMRBMS19.bib},
bibsource = {dblp computer science bibliography, https://dblp.org},
}
RDF data has been extensively deployed for the interlinking of health-related data in a structured way. The definition of link statements between distinct resources plays a key role to interconnect several life science repositories. However, RDF assertions are subject to change, which can affect existing links. In this article, we conduct extensive experiments to understand the evolution of links in the Linked Open Data (LOD). The objective is to empirically associate changes in the semantic definition of data resources with modifications observed in predefined links. We consider two versions of the Agrovoc RDF repository to calculate different types of change operations and associate them to link change actions. Obtained results indicate the existence of the cases investigated in this study. We demonstrate that RDF changes impact the evolution of established links.
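Associating dataset evolution with link changes, as studied above, ultimately reduces to diffing link statements across two versions of a repository. A minimal sketch, with hypothetical Agrovoc/DBpedia triples:

```python
def diff_links(old, new):
    """Classify link statements (subject, predicate, object triples)
    as added, removed, or kept between two dataset versions."""
    old, new = set(old), set(new)
    return {"added": new - old,
            "removed": old - new,
            "kept": old & new}

# Illustrative link sets from two versions of an RDF repository.
v1 = {("agrovoc:maize", "skos:exactMatch", "dbpedia:Maize"),
      ("agrovoc:rice", "skos:exactMatch", "dbpedia:Rice")}
v2 = {("agrovoc:maize", "skos:exactMatch", "dbpedia:Maize"),
      ("agrovoc:wheat", "skos:exactMatch", "dbpedia:Wheat")}
changes = diff_links(v1, v2)
```

In the study itself, each entry of `removed`/`added` would then be correlated with change operations on the resources involved.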
|
Yamamoto, V.E.;
dos Reis, J.C.
Updating ontology alignments in life sciences based on new concepts and their context (conference)
Workshop on Semantic Web Solutions for Large-Scale Biomedical Data Analytics - 18th International Semantic Web Conference (ISWC 2019),
CEUR-WS,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
ontology alignment,
ontology evolution,
mapping refinement,
concept addition,
biomedical vocabulary
)
@CONFERENCE{Yamamoto2019u,
author={Yamamoto, V.E. and dos Reis, J.C.},
title={Updating ontology alignments in life sciences based on new concepts and their context},
booktitle={CEUR Workshop Proceedings},
year={2019},
volume={2477},
pages={16-30},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85074556967&partnerID=40&md5=c3faad9a24386c21b55200bb5356a84d},
abstract={Ontologies and their associated mappings in life sciences play a central role in several semantic-enabled tasks. However, the continuous evolution of these ontologies requires updating existing concept alignments. Whereas mapping maintenance techniques have mostly handled revision and removal types of ontology changes, the addition of concepts demands further studies. This article proposes a technique to refine a set of established mappings based on the evolution of biomedical ontologies. We investigate ways of suggesting correspondences with the new version of the ontology without applying a matching operation to the whole set of ontology entities. Obtained results explore the neighbourhood of concepts in the alignment process to update mapping sets. Our experimental evaluation with several versions of aligned biomedical ontologies shows the effectiveness of considering the context of new concepts.},
author_keywords={Biomedical vocabulary; Concept addition; Mapping refinement; Ontology alignment; Ontology evolution},
publisher={CEUR-WS},
document_type={Conference Paper},
source={Scopus}
}
Ontologies and their associated mappings in life sciences play a central role in several semantic-enabled tasks. However, the continuous evolution of these ontologies requires updating existing concept alignments. Whereas mapping maintenance techniques have mostly handled revision and removal types of ontology changes, the addition of concepts demands further studies. This article proposes a technique to refine a set of established mappings based on the evolution of biomedical ontologies. We investigate ways of suggesting correspondences with the new version of the ontology without applying a matching operation to the whole set of ontology entities. Obtained results explore the neighbourhood of concepts in the alignment process to update mapping sets. Our experimental evaluation with several versions of aligned biomedical ontologies shows the effectiveness of considering the context of new concepts.
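The core idea — restricting the matching space for a newly added concept to targets already mapped from its neighbourhood, rather than rematching whole ontologies — can be sketched as follows; concept names and SNOMED CT codes are purely illustrative:

```python
def candidate_mappings(new_concept, neighbors, existing_mappings):
    """For a concept added to the source ontology, collect candidate
    target concepts from the existing mappings of its neighbours
    (e.g., parents and siblings), instead of matching against the
    entire target ontology."""
    targets = set()
    for n in neighbors.get(new_concept, ()):
        targets.update(existing_mappings.get(n, ()))
    return targets

# Hypothetical context: "myocarditis" was added near two mapped concepts.
neighbors = {"myocarditis": ["carditis", "heart disease"]}
mappings = {"carditis": {"SNOMED:50920009"},
            "heart disease": {"SNOMED:56265001"}}
cands = candidate_mappings("myocarditis", neighbors, mappings)
```

A matcher would then score only these candidates against the new concept, which is what makes the approach cheaper than a full rematch.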
|
do Espírito Santo, Jacqueline M.;
de Paula, Erich Vinicius;
Medeiros, Claudia Bauzer
Exploring Semantics in Clinical Data Interoperability (conference)
Advances in Conceptual Modeling,
Springer International Publishing,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
interoperability,
Medical Knowledge Organizations Systems,
semantic query,
query expansion
)
@inproceedings{Santo2019,
title={Exploring Semantics in Clinical Data Interoperability},
author={Jacqueline do Espírito Santo and Erich Vinicius de Paula and Claudia Bauzer Medeiros},
booktitle={Advances in Conceptual Modeling},
pages={201-210},
year={2019},
organization={Springer International Publishing}
}
|
Rossanez, A.;
dos Reis, J.C.
Generating knowledge graphs from scientific literature of degenerative diseases (conference)
International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2019) - 18th International Semantic Web Conference (ISWC 2019),
CEUR-WS,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Knowledge Graphs,
RDF triples,
Ontologies,
Information Extraction
)
@CONFERENCE{Rossanez2019g,
author={Rossanez, A. and dos Reis, J.C.},
title={Generating knowledge graphs from scientific literature of degenerative diseases},
booktitle={CEUR Workshop Proceedings},
year={2019},
volume={2427},
pages={12-23},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85071743429&partnerID=40&md5=9a5a3c77214912a8d4b7132f6a2ab283},
abstract={Degenerative diseases, such as Alzheimer’s disease, can be very serious and life-threatening. As the scientific community strives to fully understand their exact root causes and to advance research on the domain, a massive amount of knowledge is generated. To represent and link all this knowledge, we propose generating knowledge graphs from the scientific literature. We aim to give researchers the ability to relate their new discoveries to the current knowledge and possibly formulate new hypotheses to further advance the research. In this paper, we describe a method to extract information from the scientific literature for generating a knowledge graph that reuses existing domain ontologies. We demonstrate the effectiveness of our method by generating knowledge graphs from a set of abstracts of scientific papers on Alzheimer’s disease.},
author_keywords={Information extraction; Knowledge graphs; Ontologies; RDF triples},
publisher={CEUR-WS},
document_type={Conference Paper},
source={Scopus}
}
Degenerative diseases, such as Alzheimer’s disease, can be very serious and life-threatening. As the scientific community strives to fully understand their exact root causes and to advance research on the domain, a massive amount of knowledge is generated. To represent and link all this knowledge, we propose generating knowledge graphs from the scientific literature. We aim to give researchers the ability to relate their new discoveries to the current knowledge and possibly formulate new hypotheses to further advance the research. In this paper, we describe a method to extract information from the scientific literature for generating a knowledge graph that reuses existing domain ontologies. We demonstrate the effectiveness of our method by generating knowledge graphs from a set of abstracts of scientific papers on Alzheimer’s disease.
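The output of such a method is ultimately a set of RDF triples. A minimal sketch serializing extracted (subject, predicate, object) strings as N-Triples lines, under a placeholder base IRI (the real method maps terms to existing ontology IRIs instead):

```python
def to_ntriples(triples, base="http://example.org/"):
    """Serialize (subject, predicate, object) string triples as
    N-Triples lines; `base` is a placeholder IRI prefix."""
    def iri(term):
        # Minimal IRI construction for the sketch; real pipelines
        # would resolve terms against domain ontologies.
        return "<" + base + term.replace(" ", "_") + ">"
    return "\n".join(f"{iri(s)} {iri(p)} {iri(o)} ." for s, p, o in triples)

# Illustrative triple, as might be extracted from an abstract.
lines = to_ntriples([("amyloid_beta", "associated_with", "alzheimer_disease")])
```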
|
Destro, J.M.;
dos Reis, J.C.;
da Silva Torres, R.;
Ricarte;
I."
Ontology changes-driven semantic refinement of cross-language biomedical ontology alignments (conference)
International Workshop on Semantic Web Solutions for Large-Scale Biomedical Data Analytics - 18th International Semantic Web Conference (ISWC 2019),
CEUR-WS,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
mapping refinement,
ontology evolution,
cross-language alignment
)
@CONFERENCE{Destro2019o,
author={Destro, J.M. and dos Reis, J.C. and da Silva Torres, R. and Ricarte, I.},
title={Ontology changes-driven semantic refinement of cross-language biomedical ontology alignments},
booktitle={CEUR Workshop Proceedings},
year={2019},
volume={2477},
pages={31-15},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85074606707&partnerID=40&md5=f22a2f846062015df7a6df66ee247be6},
abstract={Biomedical computational systems benefit from the use of ontologies. However, interconnectivity between these systems is a challenge, especially when the ontologies supporting each system are described in different natural languages. Ontology alignment plays a key role in data exchange. Existing ontology matching approaches usually provide only the equivalence type of relation in the generated mappings. In this article, we propose a refinement technique that enables updating the semantic type of the mapping beyond equivalence. Our approach relies on information from the ontology evolution. Our evaluation considered LOINC releases in different languages. The results demonstrate the usefulness of ontology evolution changes to support the process of mapping refinement.},
author_keywords={Cross-language alignment; Mapping refinement; Ontology evolution},
publisher={CEUR-WS},
document_type={Conference Paper},
source={Scopus}
}
Biomedical computational systems benefit from the use of ontologies. However, interconnectivity between these systems is a challenge, especially when the ontologies supporting each system are described in different natural languages. Ontology alignment plays a key role in data exchange. Existing ontology matching approaches usually provide only the equivalence type of relation in the generated mappings. In this article, we propose a refinement technique that enables updating the semantic type of the mapping beyond equivalence. Our approach relies on information from the ontology evolution. Our evaluation considered LOINC releases in different languages. The results demonstrate the usefulness of ontology evolution changes to support the process of mapping refinement.
|
Destro, Juliana Medeiros;
Vargas, Javier A;
dos Reis, Julio Cesar;
Torres, Ricardo Da Silva
EVOCROS: Results for OAEI 2019 (conference)
The Fourteenth International Workshop on Ontology Matching - 18th International Semantic Web Conference ISWC-2019,
CEUR-WS,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
cross-lingual matching,
semantic matching,
background knowledge,
ranking aggregation
)
@inproceedings{destro2019evocros,
title={EVOCROS: Results for OAEI 2019},
author={Destro, Juliana Medeiros and Vargas, Javier A and dos Reis, Julio Cesar and Torres, Ricardo Da Silva},
year={2019},
organization={CEUR Workshop Proceedings}
}
This paper describes the updates in EVOCROS, a cross-lingual ontology alignment system suited to create mappings between ontologies described in different natural languages. Our tool combines syntactic and semantic similarity measures with information retrieval techniques. The semantic similarity is computed via NASARI vectors used together with BabelNet, a domain-neutral semantic network. In particular, we investigate the use of rank aggregation techniques in the cross-lingual ontology alignment task. The tool employs automatic translation to a pivot language to compute the similarity. EVOCROS was tested and obtained high-quality alignments on the Multifarm dataset. We discuss the experimented configurations and the achieved results in OAEI 2019. This is our second participation in OAEI.
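The rank aggregation step investigated above can be illustrated with a Borda count over two hypothetical candidate rankings; Borda is a common aggregation scheme used here for the sketch, not necessarily the exact technique EVOCROS employs:

```python
from collections import defaultdict

def borda_aggregate(rankings):
    """Aggregate several similarity rankings of candidate target
    concepts with a Borda count: position i in a ranking of length n
    earns n - i points; candidates are returned by descending total."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for i, cand in enumerate(ranking):
            scores[cand] += n - i
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical rankings for one source concept, from two similarity measures.
syntactic = ["coeur", "cardiaque", "coronaire"]
semantic = ["cardiaque", "coronaire", "coeur"]
combined = borda_aggregate([syntactic, semantic])
```

The aggregated order reconciles the two measures: "cardiaque" wins with 5 points against 4 for "coeur" and 3 for "coronaire".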
|
Santos, Andressa C. dos;
Muriana, Luã M.;
Pimenta, Josiane R. O. G.;
Silva, José V. da;
Moreira, Eliana A.;
Reis, Julio C. dos
Investigating Aspects of Affectibility for Universal Access in Socioenactive System Scenarios (conference)
Proceedings of the 18th Brazilian Symposium on Human Factors in Computing Systems,
Association for Computing Machinery,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Affectibility,
Socioenactive,
Universal Access,
Emotion,
PAff
)
@inproceedings{AdosSantos2019,
author = {Santos, Andressa C. dos and Muriana, Lu\~{a} M. and Pimenta, Josiane R. O. G. and Silva, Jos\'{e} V. da and Moreira, Eliana A. and Reis, Julio C. dos},
title = {Investigating Aspects of Affectibility for Universal Access in Socioenactive System Scenarios},
year = {2019},
isbn = {9781450369718},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3357155.3358475},
doi = {10.1145/3357155.3358475},
booktitle = {Proceedings of the 18th Brazilian Symposium on Human Factors in Computing Systems},
articleno = {33},
numpages = {11},
keywords = {universal access; socioenactive; affectibility; emotion; PAff},
location = {Vit\'{o}ria, Esp\'{\i}rito Santo, Brazil},
}
The design process focused on universal access must be guided by a set of relevant recommendations to improve interaction design and evaluation. Interactions in socioenactive systems intensify the emphasis on social, bodily, and affective aspects. This article develops an affective study in the context of socioenactive scenarios. Our objective is to analyse the Design Principles of Affectibility (PAff) towards universal access in socioenactive systems. The analysis was conducted in a workshop held at a hospital, whose participants included children undergoing rehabilitation for face and skull disorders. Relying on the analysis applying PAff, we generated a set of recommendations which might be useful to designers for promoting universal access in socioenactive systems.
|
L. Virginio;
J. C. dos Reis
Finding Relations Between Requirements for Healthcare Information Systems Use in Hospitals: A Study on EMRAM and JCI (conference)
2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI),
IEEE,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Electronic Medical Record Adoption Model,
Healthcare Information System,
Joint Commission International
)
@INPROCEEDINGS{virginio2019f,
author={L. {Virginio} and J. C. {dos Reis}},
booktitle={2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)},
title={Finding Relations Between Requirements for Healthcare Information Systems Use in Hospitals: A Study on EMRAM and JCI},
year={2019},
volume={},
number={},
pages={1-6},
abstract={EMRAM is a maturity model whose goal is to measure the adoption and utilization of HIS functions in hospitals. However, maturity models regarding HIS in healthcare settings are not comprehensive and lack detail. Therefore, it is important for EMRAM to learn from other sources of HIS evaluation in healthcare organizations, such as JCI. In addition, it is important to understand how to adapt processes and implement technologies that can ensure compliance with the requirements established by both bodies. In this paper, we carry out an evaluation to identify relations between JCI and EMRAM requirements. We extracted EMRAM and JCI requirements and identified relations between them, with further validation by a specialist. We identified 127 relations between JCI requirements and EMRAM and/or HIS. Six JCI requirements specifically related to IT are not currently required by EMRAM and could be used to promote the evolution of the maturity model. We also identified the JCI requirements that can be supported by HIS, which can be used by healthcare organizations to facilitate the management of JCI and EMRAM conformance.},
keywords={health care;hospitals;medical information systems;healthcare information systems;hospitals;maturity model;healthcare organizations;EMRAM requirements;JCI requirements;HIS evaluation;Healthcare Information System;Electronic Medical Record Adoption Model;Joint Commission International},
doi={10.1109/CISP-BMEI48845.2019.8965782},
ISSN={},
month={Oct}
}
EMRAM is a maturity model whose goal is to measure the adoption and utilization of HIS functions in hospitals. However, maturity models regarding HIS in healthcare settings are not comprehensive and lack detail. Therefore, it is important for EMRAM to learn from other sources of HIS evaluation in healthcare organizations, such as JCI. In addition, it is important to understand how to adapt processes and implement technologies that can ensure compliance with the requirements established by both bodies. In this paper, we carry out an evaluation to identify relations between JCI and EMRAM requirements. We extracted EMRAM and JCI requirements and identified relations between them, with further validation by a specialist. We identified 127 relations between JCI requirements and EMRAM and/or HIS. Six JCI requirements specifically related to IT are not currently required by EMRAM and could be used to promote the evolution of the maturity model. We also identified the JCI requirements that can be supported by HIS, which can be used by healthcare organizations to facilitate the management of JCI and EMRAM conformance.
|
Muriana, Luã Marcelo;
Tosi, Mauro Dalle Lucca;
dos Reis, Julio Cesar
Aprendendo via o Papel de Designer e de Stakeholder: Uma Estratégia Pedagógica para Ensino de IHC (symposium)
Anais Estendidos do XVIII Simpósio Brasileiro sobre Fatores Humanos em Sistemas Computacionais,
SBC,
symposium,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Ensino Baseado em Projeto,
Papel de Designer,
Ensino de IHC,
Design Centrado no Usuário
)
@inproceedings{muriana2019a,
author = {Luã Marcelo Muriana and Mauro Dalle Lucca Tosi and Julio Cesar dos Reis},
title = {Aprendendo via o Papel de Designer e de Stakeholder: Uma Estratégia Pedagógica para Ensino de IHC},
booktitle = {Anais Estendidos do XVIII Simpósio Brasileiro sobre Fatores Humanos em Sistemas Computacionais},
location = {Vitória},
year = {2019},
keywords = {},
issn = {2177-9384},
pages = {88--93},
publisher = {SBC},
address = {Porto Alegre, RS, Brasil},
doi = {10.5753/ihc.2019.8406},
url = {https://sol.sbc.org.br/index.php/ihc_estendido/article/view/8406}
}
Teaching HCI demands collaborative group practices in which students experience different roles in the design process of interactive software. In this article, we evaluate the use of a pedagogical strategy for teaching HCI in which students play both the role of designers and of stakeholders, in a project-based learning approach. Students are divided into groups that initially choose the topics of the projects they will work on as designers. However, to simulate an experience closer to the real world, in which choosing project topics is not possible, the groups are paired and have their topics swapped. Thus, each group is the designer of a project with an unfamiliar topic (not defined by the group itself) and plays the role of "client" for the topic it chose. To evaluate the students' perception of learning with respect to the devised teaching strategy, we applied a questionnaire to Computer Science and Computer Engineering classes. Based on the 123 responses analyzed, 96% of the students stated that the project positively helped their learning of HCI.
|
Borges, Marcos Vinícius Macêdo;
dos Reis, Julio Cesar
Semantic-Enhanced Recommendation of Video Lectures (conference)
2019 IEEE 19th International Conference on Advanced Learning Technologies (ICALT),
IEEE,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Learning support,
Ontology,
Recommendation System,
Semantic Annotation
)
@INPROCEEDINGS{borges2019s,
author={M. V. {Macêdo Borges} and J. C. {dos Reis}},
booktitle={2019 IEEE 19th International Conference on Advanced Learning Technologies (ICALT)},
title={Semantic-Enhanced Recommendation of Video Lectures},
year={2019},
volume={2161-377X},
number={},
pages={42-46},
abstract={Learning support systems explore several audio-visual resources to consider individual needs and learning styles aiming to stimulate learning experiences. However, the large amount of educational content in different formats and the possibility of making them available in a fragmented way turns difficult the tasks of accessing these resources and understanding the concepts under study. Although literature has proposed approaches to explore explicit semantic representation through artifacts such as ontologies in learning support systems, this research line still requires further investigation efforts. In this paper, we propose a method for recommending educational content by exploring the use of semantic annotations over textual transcriptions from video lessons. Our investigation addresses the difficulties in extracting entities from natural language texts as subtitles of videos. We report on major challenges to achieve the representation of video transcriptions as semantic annotations for automatic recommendation of educational content.},
keywords={computer aided instruction;ontologies (artificial intelligence);text analysis;semantic-enhanced recommendation;video lectures;support systems;audio-visual resources;individual needs;learning styles;educational content;explicit semantic representation;research line;investigation efforts;semantic annotations;video lessons;video transcriptions;automatic recommendation;textual transcriptions;learning experiences;natural language texts;Ontologies;Semantics;Annotations;Task analysis;Computer science;Indexing;Ontology, Learning support, Recommendation System, Semantic Annotation},
doi={10.1109/ICALT.2019.00013},
ISSN={2161-377X},
month={July}
}
Learning support systems explore several audio-visual resources to consider individual needs and learning styles aiming to stimulate learning experiences. However, the large amount of educational content in different formats and the possibility of making them available in a fragmented way turns difficult the tasks of accessing these resources and understanding the concepts under study. Although literature has proposed approaches to explore explicit semantic representation through artifacts such as ontologies in learning support systems, this research line still requires further investigation efforts. In this paper, we propose a method for recommending educational content by exploring the use of semantic annotations over textual transcriptions from video lessons. Our investigation addresses the difficulties in extracting entities from natural language texts as subtitles of videos. We report on major challenges to achieve the representation of video transcriptions as semantic annotations for automatic recommendation of educational content.
|
Victorelli, Eliane Zambon;
dos Reis, Julio Cesar;
Santos, Antonio Alberto Souza;
Schiozer, Denis José
Participatory Evaluation of Human-Data Interaction Design Guidelines (conference)
Human-Computer Interaction - INTERACT 2019,
Springer,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Human-data interaction,
Design guidelines,
Design evaluation,
Participatory design,
Visual analytics,
Oil reservoirs
)
@InProceedings{victorelli2019p,
author="Victorelli, Eliane Zambon
and Reis, Julio Cesar dos
and Santos, Antonio Alberto Souza
and Schiozer, Denis Jos{\'e}",
editor="Lamas, David
and Loizides, Fernando
and Nacke, Lennart
and Petrie, Helen
and Winckler, Marco
and Zaphiris, Panayiotis",
title="Participatory Evaluation of Human-Data Interaction Design Guidelines",
booktitle="Human-Computer Interaction -- INTERACT 2019",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="475--494",
abstract="The design of visual analytics tools for facilitating human-data interaction (HDI) plays a key role to help people identifying useful knowledge from large masses of data. Designing data visualization based on guidelines is relevant. However, it is necessary to further promote the engagement of people in evaluation activities in the design process. Stakeholders need to comprehend the guidelines to help with the evaluation results and design decisions. In this paper, we propose participatory evaluation practices based on HDI design guidelines. The practices aim to create the conditions to participants from any profile collaborate with the design guidelines evaluation. The practices were used on a design problem involving interactions with coordinated visualization. The context of application was a visual analytic tool supporting decisions related to the production strategy in oil reservoirs with the participation of key stakeholders. The results indicate that participants were able to understand the design guidelines and took advantage from them in the design decisions.",
isbn="978-3-030-29381-9"
}
The design of visual analytics tools for facilitating human-data interaction (HDI) plays a key role to help people identifying useful knowledge from large masses of data. Designing data visualization based on guidelines is relevant. However, it is necessary to further promote the engagement of people in evaluation activities in the design process. Stakeholders need to comprehend the guidelines to help with the evaluation results and design decisions. In this paper, we propose participatory evaluation practices based on HDI design guidelines. The practices aim to create the conditions to participants from any profile collaborate with the design guidelines evaluation. The practices were used on a design problem involving interactions with coordinated visualization. The context of application was a visual analytic tool supporting decisions related to the production strategy in oil reservoirs with the participation of key stakeholders. The results indicate that participants were able to understand the design guidelines and took advantage from them in the design decisions.
|
Lombello, Luma Oliveira;
dos Reis, Julio Cesar;
Bonacin, Rodrigo
Soft Ontologies as Fuzzy RDF Statements (conference)
2019 IEEE 28th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE),
IEEE,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Fuzzy,
Linked Data,
Ontology,
RDF,
Soft Ontologies,
Triples,
Triplification
)
@INPROCEEDINGS{lombello2019s,
author={L. O. {Lombello} and J. C. {dos Reis} and R. {Bonacin}},
booktitle={2019 IEEE 28th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)},
title={Soft Ontologies as Fuzzy RDF Statements},
year={2019},
volume={},
number={},
pages={289-294},
abstract={Soft ontologies enable knowledge representation in a flexible way because they are more susceptible to changes in their interrelationships over time. The transformation of a nonstandardized model of information into a computer-interpretable representation is not an easy task. Our investigation expands the concept of soft ontologies to fuzzy data represented as fuzzy RDF triples. In this paper, we elaborate a process that transforms a soft ontology implemented in the form of a matrix of probabilities into a fuzzy RDF dataset. The study presents the way the matrix of probabilities are used in the representation and how the data elements are triplified exploring fuzzy characteristics. We apply the proposal in an experimental scenario, by constructing a soft ontology which expresses a repertoire of actions in an mBot robot. Then, the facts are triplified to fuzzy RDF statements. Our results present original aspects related to the transformation of soft ontologies to fuzzy RDF triples.},
keywords={data handling;fuzzy set theory;knowledge representation;matrix algebra;ontologies (artificial intelligence);probability;soft ontology;fuzzy RDF statements;fuzzy data;fuzzy RDF triples;fuzzy RDF dataset;matrix of probabilities;data elements;fuzzy characteristics;mBot robot;Ontologies;Resource description framework;Automobiles;Proposals;Vocabulary;Data models;Computational modeling;Ontology, Soft Ontologies, Fuzzy, RDF, Triples, Triplification, Linked Data},
doi={10.1109/WETICE.2019.00067},
ISSN={2641-8169},
month={June}
}
Soft ontologies enable knowledge representation in a flexible way because they are more susceptible to changes in their interrelationships over time. The transformation of a nonstandardized model of information into a computer-interpretable representation is not an easy task. Our investigation expands the concept of soft ontologies to fuzzy data represented as fuzzy RDF triples. In this paper, we elaborate a process that transforms a soft ontology implemented in the form of a matrix of probabilities into a fuzzy RDF dataset. The study presents the way the matrix of probabilities are used in the representation and how the data elements are triplified exploring fuzzy characteristics. We apply the proposal in an experimental scenario, by constructing a soft ontology which expresses a repertoire of actions in an mBot robot. Then, the facts are triplified to fuzzy RDF statements. Our results present original aspects related to the transformation of soft ontologies to fuzzy RDF triples.
|
Borges, Marcos Vinicius Macedo;
dos Reis, Julio Cesar;
Gribeler, Guilherme Pereira
Empirical Analysis of Semantic Metadata Extraction from Video Lecture Subtitles (conference)
2019 IEEE 28th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE),
IEEE,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Metadata extraction,
Ontology,
Semantic Annotation,
Video lectures
)
@INPROCEEDINGS{borges2019e,
author={M. V. {Macedo Borges} and J. C. {dos Reis} and G. {Pereira Gribeler}},
booktitle={2019 IEEE 28th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)},
title={Empirical Analysis of Semantic Metadata Extraction from Video Lecture Subtitles},
year={2019},
volume={},
number={},
pages={301-306},
abstract={Video lectures improve the learning experiences considering individual's needs and learning styles. However, the large amount of educational content and their availability in a fragmented way turns difficult the tasks of accessing these resources and understanding the concepts under study. Extracting relevant information from video lectures can be useful for recommendation purposes and for helping the interpretation of a concept in an exact moment of a lecture. The extraction of semantic metadata from a video natural language subtitle involves challenges in dealing with informal aspects of language and the detection of semantic classes from the text. In this paper, we conduct an empirical analysis of semantic annotation approaches supported by ontologies in the extraction of relevant metadata from textual transcriptions of video lectures in Computer Science. The obtained results indicate that existing tools can be useful for the studied task and the video lecture semantic metadata extraction process is highly influenced by the underlying ontologies.},
keywords={computer aided instruction;computer science education;information retrieval;interactive video;meta data;multimedia computing;natural language processing;text analysis;video lecture subtitles;video lectures;video natural language subtitle;semantic classes;semantic annotation approaches;semantic metadata extraction process;learning experiences;learning styles;educational content;ontologies;textual transcriptions;computer science;Ontologies;Semantics;Tools;Metadata;Computer science;Task analysis;Natural languages;Ontology, Semantic Annotation, Metadata extraction, Video lectures},
doi={10.1109/WETICE.2019.00069},
ISSN={2641-8169},
month={June}
}
Video lectures improve the learning experiences considering individual's needs and learning styles. However, the large amount of educational content and their availability in a fragmented way turns difficult the tasks of accessing these resources and understanding the concepts under study. Extracting relevant information from video lectures can be useful for recommendation purposes and for helping the interpretation of a concept in an exact moment of a lecture. The extraction of semantic metadata from a video natural language subtitle involves challenges in dealing with informal aspects of language and the detection of semantic classes from the text. In this paper, we conduct an empirical analysis of semantic annotation approaches supported by ontologies in the extraction of relevant metadata from textual transcriptions of video lectures in Computer Science. The obtained results indicate that existing tools can be useful for the studied task and the video lecture semantic metadata extraction process is highly influenced by the underlying ontologies.
|
Saraiva, Márcio de Carvalho
Relationships among educational materials through the extraction of implicit topics (phdthesis)
phdthesis,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Teaching materials,
Education - Technological innovations,
Data mining,
Classification
)
@phdthesis{Saraiva2019,
abstract= {Digital educational documents are growing in size and variety, catering to an increasingly heterogeneous public. As a consequence, students are facing difficulties to find their way through such material. Several scientists have created online repositories to store and facilitate access to these documents. Unfortunately, in most such repositories documents are stored in a haphazard way. This hampers distinguishing among contents of these materials, as well as their retrieval. As a consequence, students interested in accessing relevant material revert to (traditional) Web search engines, or to browsing through one specific repository. In most cases, the results of invoking those search engines are presented as a set (or disjunction) of potentially interesting documents, which may not be adapted to the learning purpose. One of the initiatives that have emerged to solve this problem involves the use of automatic classification algorithms, e.g. Topic Modeling and Topic Labeling. However, there remains the difficulty of analyzing implicit relationships among topics of materials and lecturers from different places, even within a single repository. Moreover, these solutions have not been applied to sets of documents with different formats, and do not take advantage of additional information - e.g., metadata to extract topics. This work presents CIMAL, a framework for flexible analysis of educational material repositories; CIMAL combines semantic classification, taxonomies and graph structures to extract topics and their multiple relationships. We validated our proposal through a prototype that uses real materials from Coursera (Johns Hopkins University and University of Michigan) and a Higher Education Institute from Brazil. As far as we know, this is the first time that both slide and video features guide text analysis, topic classification techniques and relationship discovery among documents.
The elicitation of topics covered in various educational documents and of their potential relationships can support teachers and students in undertaking academic activities that are more dynamic than conventional ones – e.g., in which new relationships are found between different subjects from different sources. This can also make it easier to search the most appropriate items in educational repositories to learn new concepts, enhancing the development of new courses. From the computational point of view, this research contributes to the improvement of techniques for handling unstructured documents and documents of different formats.},
author = {Márcio de Carvalho Saraiva},
date = {2019-08-14},
keyword = {Teaching materials;Education - Technological innovations;Data mining;Classification},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2019/TeseMarcio.pdf},
school = {University of Campinas - Institute of Computing},
title = {Relationships among educational materials through the extraction of implicit topics},
year = {2019}
}
Digital educational documents are growing in size and variety, catering to an increasingly heterogeneous public. As a consequence, students are facing difficulties to find their way through such material. Several scientists have created online repositories to store and facilitate access to these documents. Unfortunately, in most such repositories documents are stored in a haphazard way. This hampers distinguishing among contents of these materials, as well as their retrieval. As a consequence, students interested in accessing relevant material revert to (traditional) Web search engines, or to browsing through one specific repository. In most cases, the results of invoking those search engines are presented as a set (or disjunction) of potentially interesting documents, which may not be adapted to the learning purpose. One of the initiatives that have emerged to solve this problem involves the use of automatic classification algorithms, e.g. Topic Modeling and Topic Labeling. However, there remains the difficulty of analyzing implicit relationships among topics of materials and lecturers from different places, even within a single repository. Moreover, these solutions have not been applied to sets of documents with different formats, and do not take advantage of additional information - e.g., metadata to extract topics. This work presents CIMAL, a framework for flexible analysis of educational material repositories; CIMAL combines semantic classification, taxonomies and graph structures to extract topics and their multiple relationships. We validated our proposal through a prototype that uses real materials from Coursera (Johns Hopkins University and University of Michigan) and a Higher Education Institute from Brazil. As far as we know, this is the first time that both slide and video features guide text analysis, topic classification techniques and relationship discovery among documents.
The elicitation of topics covered in various educational documents and of their potential relationships can support teachers and students in undertaking academic activities that are more dynamic than conventional ones – e.g., in which new relationships are found between different subjects from different sources. This can also make it easier to search the most appropriate items in educational repositories to learn new concepts, enhancing the development of new courses. From the computational point of view, this research contributes to the improvement of techniques for handling unstructured documents and documents of different formats.
|
dos Santos, Andressa Cristina;
Maike, Vanessa Regina Margareth Lima;
Mendez Mendoza, Yusseli Lizeth;
da Silva, José Valderlei;
Bonacin, Rodrigo;
Dos Reis, Julio Cesar;
Baranauskas, Maria Cecília Calani
Inquiring Evaluation Aspects of Universal Design and Natural Interaction in Socioenactive Scenarios (conference)
Universal Access in Human-Computer Interaction. Theory, Methods and Tools,
Springer,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Accessibility,
Interaction evaluation,
Ubiquitous computing,
Pervasive,
Natural User Interfaces,
Universal Design,
Universal Access
)
@InProceedings{dosSantos2019i,
author="dos Santos, Andressa Cristina
and Maike, Vanessa Regina Margareth Lima
and M{\'e}ndez Mendoza, Yusseli Lizeth
and da Silva, Jos{\'e} Valderlei
and Bonacin, Rodrigo
and Dos Reis, Julio Cesar
and Baranauskas, Maria Cec{\'i}lia Calani",
editor="Antona, Margherita
and Stephanidis, Constantine",
title="Inquiring Evaluation Aspects of Universal Design and Natural Interaction in Socioenactive Scenarios",
booktitle="Universal Access in Human-Computer Interaction. Theory, Methods and Tools",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="39--56",
abstract="New technologies and ubiquitous systems present new forms and modalities of interaction. Evaluating such systems, particularly in the novel socioenactive scenario, poses a difficult issue, as existing instruments do not capture all aspects intrinsic to such scenario. One of the key aspects is the wide range of characteristics and needs of both users and technology involved. In this paper, we are concerned with aspects of both Universal Design (UD) and Natural User Interfaces (NUIs). We present a case study where we applied, within a socioenactive scenario, evaluation instruments relying on principles and heuristics from these areas. The scenario involved six children from a hospital that treats craniofacial deformities, playing in a rich interactive environment with displays and plush animals that respond to hugs. Our results based on the analysis of the evaluation conducted in the case study suggest informed recommendations of how to use the evaluation instruments in the context of socioenactive systems and their limitations.",
isbn="978-3-030-23560-4"
}
New technologies and ubiquitous systems present new forms and modalities of interaction. Evaluating such systems, particularly in the novel socioenactive scenario, poses a difficult issue, as existing instruments do not capture all aspects intrinsic to such scenario. One of the key aspects is the wide range of characteristics and needs of both users and technology involved. In this paper, we are concerned with aspects of both Universal Design (UD) and Natural User Interfaces (NUIs). We present a case study where we applied, within a socioenactive scenario, evaluation instruments relying on principles and heuristics from these areas. The scenario involved six children from a hospital that treats craniofacial deformities, playing in a rich interactive environment with displays and plush animals that respond to hugs. Our results based on the analysis of the evaluation conducted in the case study suggest informed recommendations of how to use the evaluation instruments in the context of socioenactive systems and their limitations.
|
Ramos, Pedro Alan T.;
dos Reis, Julio Cesar;
de Souza dos Santos, Antonio Alberto;
Schiozer, Denis José
Participatory Design of System Messages in Petroleum Fields Management Software (conference)
Human-Computer Interaction. Design Practice in Contemporary Societies,
Springer,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Participatory design,
Reservoir simulations,
Braindrawing,
Brainwriting,
Scientific software,
Application System Messages
)
@InProceedings{ramos2019p,
author="Ramos, Pedro Alan T.
and dos Reis, Julio Cesar
and de Souza dos Santos, Antonio Alberto
and Schiozer, Denis Jos{\'e}",
editor="Kurosu, Masaaki",
title="Participatory Design of System Messages in Petroleum Fields Management Software",
booktitle="Human-Computer Interaction. Design Practice in Contemporary Societies",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="459--475",
abstract="Users face difficulties in understanding the progress of simulation tasks in oil reservoirs. It is necessary to turn clear to users when some task suffers errors because this is time consuming. In this paper, we propose the use of participatory design to conceive Application System Messages (ASMs) in software tools implemented to support studies related to Numerical Simulation and Management of Petroleum Reservoirs. We explored braindrawing and brainwriting techniques to acquire early concepts for a redesign of the ASMs' presentation and content. Our obtained results indicate that the use of participatory practices is useful to improve the redesign of ASMs in our study context.",
isbn="978-3-030-22636-7"
}
Users face difficulties in understanding the progress of simulation tasks in oil reservoirs. It is necessary to turn clear to users when some task suffers errors because this is time consuming. In this paper, we propose the use of participatory design to conceive Application System Messages (ASMs) in software tools implemented to support studies related to Numerical Simulation and Management of Petroleum Reservoirs. We explored braindrawing and brainwriting techniques to acquire early concepts for a redesign of the ASMs' presentation and content. Our obtained results indicate that the use of participatory practices is useful to improve the redesign of ASMs in our study context.
|
Caceffo, Ricardo;
Alves Moreira, Eliana;
Bonacin, Rodrigo;
dos Reis, Julio Cesar;
Luque Carbajal, Marleny;
D'Abreu, João Vilhete V.;
Brennand, Camilla V. L. T.;
Lombello, Luma;
Valente, José Armando;
Baranauskas, Maria Cecília Calani
Collaborative Meaning Construction in Socioenactive Systems: Study with the mBot (conference)
Learning and Collaboration Technologies. Designing Learning Experiences,
Springer,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Enactive,
Educational,
Robots,
Interactive design,
Evaluation,
Ontologies,
Emotions,
HCI
)
@InProceedings{caceffo2019c,
author="Caceffo, Ricardo
and Alves Moreira, Eliana
and Bonacin, Rodrigo
and dos Reis, Julio Cesar
and Luque Carbajal, Marleny
and D'Abreu, Jo{\~a}o Vilhete V.
and Brennand, Camilla V. L. T.
and Lombello, Luma
and Valente, Jos{\'e} Armando
and Baranauskas, Maria Cec{\'i}lia Calani",
editor="Zaphiris, Panayiotis
and Ioannou, Andri",
title="Collaborative Meaning Construction in Socioenactive Systems: Study with the mBot",
booktitle="Learning and Collaboration Technologies. Designing Learning Experiences",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="237--255",
abstract="The design of interactive systems concerned with the impact of the technology on the human agent as well as the effect of the human experience on the technology is not a trivial task. Our investigation goes towards a vision of socioenactive systems, by supporting and identifying how a group of people can dynamically and seamlessly interact with the technology. In this paper, we elaborate a set of guidelines to design socioenactive systems. We apply them in the construction of a technological framework situated in an educational environment for children around the age of 5 (N = 25). The scenario was supported by educational robots, programmed to perform a set of actions mimicking human emotional expressions. The system was designed to shape the robots' behavior according to the feedback of children's responses in iterative sessions. This entails a complete cycle, where the robot impacts the children and is affected by their experiences. We found that children create hypotheses to make sense of the robot's behavior. Our results present original aspects related to a social enactive system.",
isbn="978-3-030-21814-0"
}
The design of interactive systems concerned with the impact of the technology on the human agent as well as the effect of the human experience on the technology is not a trivial task. Our investigation goes towards a vision of socioenactive systems, by supporting and identifying how a group of people can dynamically and seamlessly interact with the technology. In this paper, we elaborate a set of guidelines to design socioenactive systems. We apply them in the construction of a technological framework situated in an educational environment for children around the age of 5 (N = 25). The scenario was supported by educational robots, programmed to perform a set of actions mimicking human emotional expressions. The system was designed to shape the robots' behavior according to the feedback of children's responses in iterative sessions. This entails a complete cycle, where the robot impacts the children and is affected by their experiences. We found that children create hypotheses to make sense of the robot's behavior. Our results present original aspects related to a social enactive system.
|
Victorelli, E.Z.;
Dos Reis, J.C.;
Souza Santos, A.A.;
Schiozer, D.J.
Design process for human-data interaction: Combining guidelines with semio-participatory techniques (conference)
ICEIS 2019 - Proceedings of the 21st International Conference on Enterprise Information Systems,
ICEIS,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Human-Data Interaction,
Design Approaches,
Visual Analytics
)
@CONFERENCE{Victorelli2019d,
author={Victorelli, E.Z. and Dos Reis, J.C. and Souza Santos, A.A. and Schiozer, D.J.},
title={Design process for human-data interaction: Combining guidelines with semio-participatory techniques},
booktitle={ICEIS 2019 - Proceedings of the 21st International Conference on Enterprise Information Systems},
year={2019},
volume={2},
pages={410-421},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85067430856&partnerID=40&md5=f5ef73373a1a2196c5c421dece0d9a1b},
author_keywords={Design Approaches; Human-Data Interaction; Visual Analytics},
document_type={Conference Paper},
source={Scopus}
}
The complexity of analytically reasoning to extract and identify useful knowledge from large masses of data requires that the design of visual analytics tools addresses challenges of facilitating human-data interaction (HDI). Designing data visualisation based on guidelines is fast and low-cost, but does not favour the engagement of people in the process. In this paper, we propose a design process to integrate design based on guidelines with participatory design practices. We investigate, and when necessary, adapt existing practices for each step of our design process. The process was evaluated on a design problem involving a visual analytics tool supporting decisions related to the production strategy in oil reservoirs with the participation of key stakeholders. The generated prototype was tested with adapted participatory evaluation practices. The obtained results indicate participants’ satisfaction with the design practices used and detected the fulfilment of users’ needs. The design process and the associated practices may serve as a basis for improving the HDI in other contexts.
|
Venero, Sheila Katherine;
dos Reis, Julio Cesar;
Montecchi, Leonardo;
Rubira, Cecília Mary Fischer
Towards a Metamodel for Supporting Decisions in Knowledge-Intensive Processes (conference)
Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing,
ACM,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
knowledge-intensive process,
business process management systems,
process-aware information systems,
business process modeling,
case management,
knowledge management
)
@inproceedings{venero2019t,
author = {Venero, Sheila Katherine and Reis, Julio Cesar dos and Montecchi, Leonardo and Rubira, Cec\'{\i}lia Mary Fischer},
title = {Towards a Metamodel for Supporting Decisions in Knowledge-Intensive Processes},
year = {2019},
isbn = {9781450359337},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3297280.3297290},
doi = {10.1145/3297280.3297290},
booktitle = {Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing},
pages = {75--84},
numpages = {10},
keywords = {knowledge-intensive process, business process management systems, process-aware information systems, business process modeling, case management, knowledge management},
location = {Limassol, Cyprus},
series = {SAC '19}
}
Knowledge-intensive processes (KiPs) cannot be fully specified at design time because not all information about the process is available prior to its execution. At runtime, new information emerges reflecting environment changes or unexpected outcomes. The structure of this kind of processes varies from case to case and it is defined step-by-step based on knowledge worker's decisions made after analyzing the current situation. These decisions rely on the knowledge worker's experience and available information. Current process management approaches still need to adequately address the complex characteristics of knowledge-intensive processes, such as their unpredictability, emergency, non-repeatability, and dynamism. This paper proposes a metamodel for representing KiPs aiming to help knowledge workers during the decision-making process. Domain and organizational knowledge are modeled by objectives and tactics. The metamodel supports the definition of objectives, metrics, tactics, goals and strategies at runtime according to a specific situation. Also, it includes concepts related to context and environment elements, business artifacts, roles and rules. The feasibility of our model was evaluated via a proof of concept in the medical domain.
|
Wang, Y.;
Dos Reis, J.C.;
Borggren, K.M.;
Vaz Salles, M.A.;
Medeiros, C.B.;
Zhou, Y.
Modeling and building IoT data platforms with actor-oriented databases (conference)
Advances in Database Technology,
EDBT,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
)
@CONFERENCE{Wang2019m,
author={Wang, Y. and Dos Reis, J.C. and Borggren, K.M. and Vaz Salles, M.A. and Medeiros, C.B. and Zhou, Y.},
title={Modeling and building IoT data platforms with actor-oriented databases},
journal={Advances in Database Technology - EDBT},
year={2019},
volume={2019-March},
pages={512-523},
doi={10.5441/002/edbt.2019.47},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85064947935&doi=10.5441%2f002%2fedbt.2019.47&partnerID=40&md5=3b80806b1652019da853394ca911f61b},
document_type={Conference Paper},
source={Scopus}
}
Vast amounts of data are being generated daily with the adoption of Internet-of-Things (IoT) solutions in an ever-increasing number of application domains. There are problems associated with all stages of the lifecycle of these data (e.g., capture, curation and preservation). Moreover, the volume, variety, dynamicity and ubiquity of IoT data present additional challenges to their usability, prompting the need for constructing scalable data-intensive IoT data management and processing platforms. This paper presents a novel approach to model and build IoT data platforms based on the characteristics of an Actor-Oriented Database (AODB). We take advantage of two complementary case studies, in structural health monitoring and in beef cattle tracking and tracing, to describe novel software requirements introduced by IoT data processing. Our investigation illustrates the challenges and benefits provided by AODB to meet these requirements in terms of modeling and IoT-based systems implementation. Obtained results reveal the advantages of using AODB in IoT scenarios and lead to principles on how to effectively use an actor model to design and implement IoT data platforms.
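The actor-oriented idea the abstract builds on can be illustrated with a minimal, synchronous in-memory sketch, in which each monitored entity (say, one sensor on a bridge) is an actor that owns its state. The class and message shapes below are invented for illustration; this is not the AODB system used in the paper, which adds distribution, persistence and asynchrony.

```python
# Minimal sketch of the actor idea behind actor-oriented databases:
# each entity owns its state and processes messages one at a time.
# This toy version is synchronous and in-memory; real AODBs
# (e.g., Orleans-style virtual actors) are distributed and persistent.
from dataclasses import dataclass, field

@dataclass
class SensorActor:
    sensor_id: str
    readings: list = field(default_factory=list)

    def receive(self, message: dict):
        """Handle one message; the actor serializes access to its state."""
        if message["type"] == "append":
            self.readings.append(message["value"])
        elif message["type"] == "average":
            return sum(self.readings) / len(self.readings)

actors = {}  # actor "directory": id -> actor, created on demand

def send(sensor_id: str, message: dict):
    actor = actors.setdefault(sensor_id, SensorActor(sensor_id))
    return actor.receive(message)

send("bridge-7", {"type": "append", "value": 2.0})
send("bridge-7", {"type": "append", "value": 4.0})
avg = send("bridge-7", {"type": "average"})  # 3.0
```

The on-demand creation in `send` mirrors the virtual-actor style of addressing entities by identity rather than by object reference.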
|
Moreira, Eliana;
DOS REIS, Julio;
Baranauskas, Maria Cecília
Tangible Artifacts and the Evaluation of Affective States by children (journal)
Brazilian Journal of Computers in Education,
RBIE,
journal,
2019.
(
Abstract |
Links |
BibTeX |
Tags:
Tangible interfaces,
Evaluation,
Affective states,
Ludic
)
@article{moreira2019t,
author = {Eliana Moreira and Julio DOS REIS and Maria Cecília Baranauskas},
title = {Artefatos Tangíveis e a Avaliação de Estados Afetivos por Crianças},
journal = {Revista Brasileira de Informática na Educação},
volume = {27},
number = {01},
year = {2019},
keywords = {Tangible interfaces; Evaluation; Affective states; Ludic},
abstract = {Sistemas computacionais contemporâneos e ubíquos demandam cada vez mais avaliações que consideram aspectos para além da ergonomia, usabilidade e acessibilidade, para incluir também meios de entender o estado afetivo dos envolvidos na interação. Contudo, principalmente quando as partes envolvidas são crianças, é necessário promover meios lúdicos e acessíveis para envolver as pessoas nas atividades de avaliação, pois espera-se que a ferramenta utilizada na avaliação permita que os envolvidos se expressem de acordo com sua idade e compreensão. Trabalhos existentes propõem soluções abstratas que dificultam a compreensão e a participação das pessoas na expressão de estados afetivos. Neste artigo, desenvolvemos e avaliamos o ambiente TangiSAM, que engloba conjuntos de bonecos tridimensionais concretos que se utilizam de tecnologias tangíveis que permitem efetuar avaliação de estados afetivos de maneira lúdica. Conduzimos um estudo em um espaço educativo real com crianças e professoras para entender se os artefatos tangíveis do TangiSAM favorecem uma melhor experiência de autoavaliação. Descobrimos que o TangiSAM obteve maior preferência pelos participantes quando comparado com outras propostas de representação de estados afetivos.},
issn = {2317-6121}, pages = {58}, doi = {10.5753/rbie.2019.27.01.58},
url = {https://www.br-ie.org/pub/index.php/rbie/article/view/7753}
}
Modern and ubiquitous computational systems increasingly demand evaluations that consider aspects beyond ergonomics, usability and accessibility, including means of understanding the affective states of those involved in the interaction. Nevertheless, whenever the involved parties are predominantly children, it becomes necessary to promote ludic and accessible means of involving people in the evaluation activities, because the assessment tool is expected to allow all stakeholders to express themselves according to their age and understanding. Existing studies have proposed abstract solutions that hinder the comprehension and participation of those involved in the expression of affective states. In this article, we developed and evaluated the TangiSAM environment, which includes sets of three-dimensional concrete manikins that take advantage of tangible technologies, allowing the assessment of affective states in a ludic manner. We conducted an evaluation in a real-world educational setting, including both children and teachers, in order to understand whether TangiSAM's tangible artifacts favor a better self-evaluation experience. We found that TangiSAM was most frequently chosen as the favorite by participants in comparison with other affective-state representation proposals.
|
2018 |
Virginio, Luiz;
dos Reis, Julio Cesar
Automated Coding of Medical Diagnostics from Free-Text: The Role of Parameters Optimization and Imbalanced Classes (conference)
Data Integration in the Life Sciences,
Springer,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Automated ICD coding,
Multi-label classification,
Imbalanced classes
)
@InProceedings{virginio2018a,
author="Virginio, Luiz
and dos Reis, Julio Cesar",
editor="Auer, S{\"o}ren
and Vidal, Maria-Esther",
title="Automated Coding of Medical Diagnostics from Free-Text: The Role of Parameters Optimization and Imbalanced Classes",
booktitle="Data Integration in the Life Sciences",
year="2019",
publisher="Springer International Publishing",
address="Cham",
pages="122--134",
abstract="The extraction of codes from Electronic Health Records (EHR) data is an important task because extracted codes can be used for different purposes such as billing and reimbursement, quality control, epidemiological studies, and cohort identification for clinical trials. The codes are based on standardized vocabularies. Diagnostics, for example, are frequently coded using the International Classification of Diseases (ICD), which is a taxonomy of diagnosis codes organized in a hierarchical structure. Extracting codes from free-text medical notes in EHR such as the discharge summary requires the review of patient data searching for information that can be coded in a standardized manner. The manual human coding assignment is a complex and time-consuming process. The use of machine learning and natural language processing approaches have been receiving an increasing attention to automate the process of ICD coding. In this article, we investigate the use of Support Vector Machines (SVM) and the binary relevance method for multi-label classification in the task of automatic ICD coding from free-text discharge summaries. In particular, we explored the role of SVM parameters optimization and class weighting for addressing imbalanced class. Experiments conducted with the Medical Information Mart for Intensive Care III (MIMIC III) database reached 49.86{\%} of f1-macro for the 100 most frequent diagnostics. Our findings indicated that optimization of SVM parameters and the use of class weighting can improve the effectiveness of the classifier.",
isbn="978-3-030-06016-9"
}
The extraction of codes from Electronic Health Records (EHR) data is an important task because extracted codes can be used for different purposes such as billing and reimbursement, quality control, epidemiological studies, and cohort identification for clinical trials. The codes are based on standardized vocabularies. Diagnostics, for example, are frequently coded using the International Classification of Diseases (ICD), a taxonomy of diagnosis codes organized in a hierarchical structure. Extracting codes from free-text medical notes in the EHR, such as the discharge summary, requires reviewing patient data in search of information that can be coded in a standardized manner. Manual coding assignment is a complex and time-consuming process. The use of machine learning and natural language processing approaches has been receiving increasing attention as a way to automate ICD coding. In this article, we investigate the use of Support Vector Machines (SVM) and the binary relevance method for multi-label classification in the task of automatic ICD coding from free-text discharge summaries. In particular, we explored the role of SVM parameter optimization and class weighting for addressing imbalanced classes...
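The classification setup described above (binary relevance, one SVM per code, class weighting for imbalance) can be sketched with scikit-learn. The toy discharge notes and ICD-style codes below are invented, and this is a didactic sketch, not the paper's actual pipeline.

```python
# Sketch of binary-relevance multi-label classification with SVMs,
# using class weighting for imbalanced labels (as discussed above).
# The toy documents and ICD-style labels are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

docs = [
    "patient admitted with chest pain and hypertension",
    "type 2 diabetes mellitus, insulin adjusted",
    "hypertension controlled, discharged home",
    "chest pain ruled out, diabetes follow-up",
]
labels = [["I10", "R07"], ["E11"], ["I10"], ["R07", "E11"]]  # toy codes

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)          # binary indicator matrix, one column per code
X = TfidfVectorizer().fit_transform(docs)

# Binary relevance: an independent, class-weighted linear SVM per label.
clf = OneVsRestClassifier(LinearSVC(class_weight="balanced", C=1.0))
clf.fit(X, Y)
pred = mlb.inverse_transform(clf.predict(X))  # back to sets of codes
```

`class_weight="balanced"` rescales the per-class penalty inversely to label frequency, which is one common way to address the imbalance the abstract refers to.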
|
Carvalho, Lucas
Reproducibility and Reuse of Experiments in eScience: Workflows, Ontologies and Scripts (phdthesis)
University of Campinas - Institute of Computing,
phdthesis,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
scientific workflows,
ontologies,
reuse,
reproducibility
)
@phdthesis{Carvalho2018b,
abstract = {Scripts and Scientific Workflow Management Systems (SWfMSs) are common approaches that have been used to automate the execution flow of processes and data analysis in scientific (computational) experiments. Although widely used in many disciplines, scripts are hard to understand, adapt, reuse, and reproduce. For this reason, several solutions have been proposed to aid experiment reproducibility for script-based environments. However, they neither allow to fully document the experiment nor do they help when third parties want to reuse just part of the code. SWfMSs, on the other hand, help documentation and reuse by supporting scientists in the design and execution of their experiments, which are specified and run as interconnected (reusable) workflow components (a.k.a. building blocks). While workflows are better than scripts for understandability and reuse, they still require additional documentation. During experiment design, scientists frequently create workflow variants, e.g., by changing workflow components. Reuse and reproducibility require understanding and tracking variant provenance, a time-consuming task. This thesis aims to support reproducibility and reuse of computational experiments. To meet these challenges, we address two research problems: (1) understanding a computational experiment, and (2) extending a computational experiment. Our work towards solving these problems led us to choose workflows and ontologies to answer both problems. 
The main contributions of this thesis are thus: (i) to present the requirements for the conversion of script to reproducible research; (ii) to propose a methodology that guides the scientists through the process of conversion of script-based experiments into reproducible workflow research objects; (iii) to design and implement features for quality assessment of computational experiments; (iv) to design and implement W2Share, a framework to support the conversion methodology, which exploits tools and standards that have been developed by the scientific community to promote reuse and reproducibility; (v) to design and implement OntoSoft-VFF, a framework for capturing information about software and workflow components to support scientists manage workflow exploration and evolution. Our work is showcased via use cases in Molecular Dynamics, Bioinformatics and Weather Forecasting.},
author = {Lucas Carvalho},
date = {2018-12-14},
keyword = {scientific workflows; ontologies; reuse; reproducibility},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2018/carvalho-lucas-thesis-2018.pdf},
school = {University of Campinas - Institute of Computing},
title = {Reproducibility and Reuse of Experiments in eScience: Workflows, Ontologies and Scripts},
year = {2018}
}
Scripts and Scientific Workflow Management Systems (SWfMSs) are common approaches that have been used to automate the execution flow of processes and data analysis in scientific (computational) experiments. Although widely used in many disciplines, scripts are hard to understand, adapt, reuse, and reproduce. For this reason, several solutions have been proposed to aid experiment reproducibility for script-based environments. However, they neither allow to fully document the experiment nor do they help when third parties want to reuse just part of the code. SWfMSs, on the other hand, help documentation and reuse by supporting scientists in the design and execution of their experiments, which are specified and run as interconnected (reusable) workflow components (a.k.a. building blocks). While workflows are better than scripts for understandability and reuse, they still require additional documentation. During experiment design, scientists frequently create workflow variants, e.g., by changing workflow components. Reuse and reproducibility require understanding and tracking variant provenance, a time-consuming task. This thesis aims to support reproducibility and reuse of computational experiments. To meet these challenges, we address two research problems: (1) understanding a computational experiment, and (2) extending a computational experiment. Our work towards solving these problems led us to choose workflows and ontologies to answer both problems. 
The main contributions of this thesis are thus: (i) to present the requirements for the conversion of script to reproducible research; (ii) to propose a methodology that guides the scientists through the process of conversion of script-based experiments into reproducible workflow research objects; (iii) to design and implement features for quality assessment of computational experiments; (iv) to design and implement W2Share, a framework to support the conversion methodology, which exploits tools and standards that have been developed by the scientific community to promote reuse and reproducibility; (v) to design and implement OntoSoft-VFF, a framework for capturing information about software and workflow components to support scientists manage workflow exploration and evolution. Our work is showcased via use cases in Molecular Dynamics, Bioinformatics and Weather Forecasting.
|
D’Abreu, J. V. V.;
DOS REIS, J. C.
Pedagogical Robotics at NIED: contributions and future perspectives ()
Tecnologia e Educação: passado, presente e o que está por vir,
NIED,
,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
)
Educational Robotics (ER) is a field of knowledge that integrates several disciplines. In schools, it is often introduced as a way to pursue an interdisciplinary approach and to foster the use of technology in education. These technologies involve kits and materials for assembling robots, software for programming them and, consequently, computers (in their most varied models and form factors) to program the automation and control of the assembled robot. In addition, these aspects should be guided by a methodology that strengthens and qualifies the use of ER as a tool capable of diversifying and enriching the teaching and learning environment at the most varied levels, from basic to higher education...
|
Carvalho, Lucas A. M. C.;
Garijo, Daniel;
Medeiros, Claudia Bauzer;
Gil, Yolanda
Semantic Software Metadata for Workflow Exploration and Evolution (conference)
Proceedings of the 2018 IEEE 14th International Conference on eScience,
IEEE,
Amsterdam, the Netherlands, October 28-November 01,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
scientific workflows,
software metadata,
software functions,
software registries,
workflow evolution
)
@conference{Carvalho2018Semantic,
abstract = {Scientific workflow management systems play a major role in the design, execution and documentation of computational experiments. However, they have limited support for managing workflow evolution and exploration because they lack rich metadata for the software that implements workflow components. Such metadata could be used to support scientists in exploring local adjustments to a workflow, replacing components with similar software, or upgrading components upon release of newer software versions. To address this challenge, we propose OntoSoft-VFF (Ontology for Software Version, Function and Functionality), a software metadata repository designed to capture information about software and workflow components that is important for managing workflow exploration and evolution. Our approach uses a novel ontology to describe the functionality and evolution through time of any software used to create workflow components. OntoSoft-VFF is implemented as an online catalog that stores semantic metadata for software to enable workflow exploration through understanding of software functionality and evolution. The catalog also supports comparison and semantic search of software metadata. We showcase OntoSoft-VFF using machine learning workflow examples. We validate our approach by testing that a workflow system could compare differences in software metadata, explain software updates, and describe the general functionality of workflow steps.},
address = {Amsterdam, the Netherlands, October 28-November 01},
author = {Lucas A. M. C. Carvalho and Daniel Garijo and Claudia Bauzer Medeiros and Yolanda Gil},
booktitle = {Proceedings of the 2018 IEEE 14th International Conference on eScience},
date = {2018-10-28},
keyword = {scientific workflows, software metadata, software functions, software registries, workflow evolution},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2018/semantic-software-metadata-for-workflow-exploration-and-evolution-camera-ready.pdf},
publisher = {IEEE},
title = {Semantic Software Metadata for Workflow Exploration and Evolution},
year = {2018}
}
Scientific workflow management systems play a major role in the design, execution and documentation of computational experiments. However, they have limited support for managing workflow evolution and exploration because they lack rich metadata for the software that implements workflow components. Such metadata could be used to support scientists in exploring local adjustments to a workflow, replacing components with similar software, or upgrading components upon release of newer software versions. To address this challenge, we propose OntoSoft-VFF (Ontology for Software Version, Function and Functionality), a software metadata repository designed to capture information about software and workflow components that is important for managing workflow exploration and evolution. Our approach uses a novel ontology to describe the functionality and evolution through time of any software used to create workflow components. OntoSoft-VFF is implemented as an online catalog that stores semantic metadata for software to enable workflow exploration through understanding of software functionality and evolution. The catalog also supports comparison and semantic search of software metadata. We showcase OntoSoft-VFF using machine learning workflow examples. We validate our approach by testing that a workflow system could compare differences in software metadata, explain software updates, and describe the general functionality of workflow steps.
|
Justo, Andrey Victor;
dos Reis, Julio Cesar;
Calado, Ivo;
Bonacin, Rodrigo;
Jensen, Felipe Rodrigues
Exploring Ontologies to Improve the Empathy of Interactive Bots (conference)
2018 IEEE 27th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE),
IEEE,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Interactive bots,
Affectivity,
Ontologies,
SWRL
)
@INPROCEEDINGS{justo2018e,
author={A. V. {Justo} and J. {Cesar dos Reis} and I. {Calado} and R. {Bonacin} and F. R. {Jensen}},
booktitle={2018 IEEE 27th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)},
title={Exploring Ontologies to Improve the Empathy of Interactive Bots},
year={2018},
volume={},
number={},
pages={261-266},
abstract={Bots are virtual agents that people can interact with text messages. They are mostly made with the aim of mimicking a person in conversations. Although several studies have devised natural language processing techniques for the creation of bots, few studies explore the use of ontologies in the development of novel context-aware interactive bots. In this article, we propose a software architecture that allows ontology-based interpretation of several types of data (audio, video, and text) from the bot's environment. We define formal concept-based rules to express affective behavior aiming to improve the empathy of bots. The proposed technique relies on Semantic technologies such as OWL and SWRL languages. This technique is illustrated in an interaction scenario.},
keywords={interactive systems;knowledge representation languages;natural language processing;ontologies (artificial intelligence);software architecture;interaction scenario;virtual agents;text messages;natural language;context-aware interactive bots;ontology-based interpretation;formal concept-based rules;OWL;SWRL languages;semantic technologies;Ontologies;Computer architecture;Personal digital assistants;Semantics;Proposals;Neurons;Face;Interactive bots;Affectivity;Ontologies;SWRL},
doi={10.1109/WETICE.2018.00057},
ISSN={1524-4547},
month={June},}
Bots are virtual agents that people can interact with through text messages. They are mostly built with the aim of mimicking a person in conversations. Although several studies have devised natural language processing techniques for the creation of bots, few studies explore the use of ontologies in the development of novel context-aware interactive bots. In this article, we propose a software architecture that allows ontology-based interpretation of several types of data (audio, video, and text) from the bot's environment. We define formal concept-based rules to express affective behavior, aiming to improve the empathy of bots. The proposed technique relies on Semantic Web technologies such as the OWL and SWRL languages. The technique is illustrated in an interaction scenario.
|
Destro, Juliana Medeiros;
dos Santos, Gabriel Oliveira;
dos Reis, Julio Cesar;
Torres, Ricardo da S.;
Carvalho, Ariadne Maria B. R.;
Ricarte, Ivan Luiz Marques
EVOCROS: Results for OAEI 2018 (conference)
The Thirteenth International Workshop on Ontology Matching - International Semantic Web Conference ISWC-2018,
CEUR-WS,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
cross-lingual matching,
semantic matching,
background knowledge
)
@inproceedings{destro2018e,
title={EVOCROS: Results for OAEI 2018},
author={Destro, Juliana Medeiros and dos Santos, Gabriel Oliveira and dos Reis, Julio Cesar and Torres, Ricardo da S. and Carvalho, Ariadne Maria B. R. and Ricarte, Ivan Luiz Marques},
booktitle={The Thirteenth International Workshop on Ontology Matching - International Semantic Web Conference ISWC-2018},
publisher={CEUR-WS},
year={2018}
}
This paper describes EVOCROS, a cross-lingual ontology alignment system suited to creating mappings between ontologies described in different natural languages. Our tool combines semantic and syntactic similarity measures in a weighted average metric. The semantic similarity is computed via NASARI vectors used together with BabelNet, a domain-neutral semantic network. The tool employs automatic translation to a pivot language to compute the similarities. EVOCROS was tested and obtained high-quality alignments on the MultiFarm dataset. We discuss the experimented configurations and the results achieved in OAEI 2018. This is our first participation in OAEI.
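The weighted combination of similarity measures mentioned above can be sketched as follows. The concrete measures and the 0.7 weight are illustrative stand-ins: the edit-ratio function plays the role of the syntactic measure, and the semantic measure (NASARI vectors with BabelNet in the paper) is mocked.

```python
# Sketch of combining a syntactic and a semantic similarity score
# in a weighted average, as the abstract describes. The semantic
# measure is a mock; the real system compares NASARI concept vectors.
from difflib import SequenceMatcher

def syntactic_sim(a: str, b: str) -> float:
    """String similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def semantic_sim(a: str, b: str) -> float:
    """Placeholder for a concept-vector similarity in [0, 1]."""
    return 1.0 if a.lower() == b.lower() else 0.5

def combined_sim(a: str, b: str, w_sem: float = 0.7) -> float:
    """Weighted average of the semantic and syntactic scores."""
    return w_sem * semantic_sim(a, b) + (1 - w_sem) * syntactic_sim(a, b)

score = combined_sim("heart attack", "myocardial infarction")
```

In a cross-lingual setting, both labels would first be translated to the pivot language before the scores are computed, which is the role of the automatic translation step described above.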
|
Abdessalem, Talel;
Medeiros, Claudia B.;
Cellary, W.;
Gancarski, W.;
Manouvrier, M.;
Rukoz, M.;
Zam, M.
The Database Version Approach: Overview and Future directions (conference)
34ème Conférence sur la Gestion de Données - Principes, Technologies et Applications (BDA 2018),
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Database versions
)
@inproceedings{abmece18,
author = {Abdessalem, Talel and Medeiros, Claudia Bauzer and Cellary, W. and Gancarski, W. and Manouvrier, M. and Rukoz, M. and Zam, M.},
booktitle = {34ème Conférence sur la Gestion de Données - Principes, Technologies et Applications (BDA 2018)},
pages = {1-10},
address = {Bucarest},
title = {{The Database Version Approach: Overview and Future directions}},
year = {2018}
}
|
Dos Reis, Julio Cesar;
Bonacin, Rodrigo;
Hornung, Heiko Horst;
Baranauskas, Maria Cecília Calani
Intenticons: Participatory selection of emoticons for communication of intentions (journal)
Computers in Human Behavior,
Elsevier,
journal,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Emoticons,
Meanings,
Intentions,
Pragmatics,
Computer-mediated communication,
User participation
)
@article{DOSREIS2018146,
title = "Intenticons: Participatory selection of emoticons for communication of intentions",
journal = "Computers in Human Behavior",
volume = "85",
pages = "146 - 162",
year = "2018",
issn = "0747-5632",
doi = "https://doi.org/10.1016/j.chb.2018.03.046",
url = "http://www.sciencedirect.com/science/article/pii/S0747563218301511",
author = "Julio Cesar [dos Reis] and Rodrigo Bonacin and Heiko Horst Hornung and M. Cecília C. Baranauskas",
keywords = "Emoticons, Meanings, Intentions, Pragmatics, Computer-mediated communication, User participation",
abstract = "Previous studies have emphasised that emoticons are able to express more than emotions, assuming a central role on computer mediation communication. Explicit consideration of intentions in computer systems might play a significant role for improving communication and collaboration. Nevertheless, web-mediated communication lacks elements that are natural in face-to-face conversation for signalling intention. In this article, we propose so-called Intenticons as a set of emoticons designed (and/or selected) to communicate intentions as an interactive mechanism to support users in expressing intentions. This study presents an experimental analysis to evaluate whether Intenticons designed in a participatory way convey intentions better than emoticons selected by designers in a non-participatory way. We rely on a theoretical framework based on Speech Act Theory and Semiotics to categorize different classes of intentions. The achieved results, based on statistical tests, revealed that the Intenticons were more adequate for most of the intention classes. Our findings demonstrated the value of the user involvement for obtaining adequate emoticons in intention sharing."
}
Previous studies have emphasised that emoticons are able to express more than emotions, assuming a central role on computer mediation communication. Explicit consideration of intentions in computer systems might play a significant role for improving communication and collaboration. Nevertheless, web-mediated communication lacks elements that are natural in face-to-face conversation for signalling intention. In this article, we propose so-called Intenticons as a set of emoticons designed (and/or selected) to communicate intentions as an interactive mechanism to support users in expressing intentions. This study presents an experimental analysis to evaluate whether Intenticons designed in a participatory way convey intentions better than emoticons selected by designers in a non-participatory way. We rely on a theoretical framework based on Speech Act Theory and Semiotics to categorize different classes of intentions. The achieved results, based on statistical tests, revealed that the Intenticons were more adequate for most of the intention classes. Our findings demonstrated the value of the user involvement for obtaining adequate emoticons in intention sharing.
|
Saraiva, Márcio de Carvalho;
Medeiros, Claudia Bauzer
Relating educational materials via extraction of their topics (conference)
Proceedings of the VLDB 2018 Ph.D. Workshop, August 27, 2018,
Rio de Janeiro, Rio de Janeiro, Brazil,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Components, Content analysis and feature selection, educational material, Information extraction, Topics Classification
)
@conference{deSaraiva2018,
abstract = {Digital educational documents are growing in size and variety,
and scientists are facing difficulties to find their way
through them. One of the initiatives that have emerged to
solve this problem involves the use of automatic classification
algorithms. However, it is difficult to analyze implicit
relationships among topics of materials. This paper presents
CIMAL, a framework for enabling flexible access to material
stored in arbitrary repositories. CIMAL combines semantic
classification, taxonomies and graphs to elicit relationships
among topics of educational documents. We validated
our work using materials from Coursera (courses offered by
Johns Hopkins University and University of Michigan) and
a Higher Education Institute, from Brazil.},
address = {Rio de Janeiro, Rio de Janeiro, Brazil},
author = {Márcio de Carvalho Saraiva and Claudia Bauzer Medeiros},
booktitle = {Proceedings of the VLDB 2018 Ph.D. Workshop, August 27, 2018. Rio de
Janeiro, Brazil},
date = {2018-08-27},
keyword = {Components, Content analysis and feature selection, educational material, Information extraction, Topics Classification},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2018/Marcio-PHDVLDB.pdf},
organization = {IEEE},
title = {Relating educational materials via extraction of their topics},
year = {2018}
}
Digital educational documents are growing in size and variety, and scientists are facing difficulties to find their way through them. One of the initiatives that have emerged to solve this problem involves the use of automatic classification algorithms. However, it is difficult to analyze implicit relationships among topics of materials. This paper presents CIMAL, a framework for enabling flexible access to material stored in arbitrary repositories. CIMAL combines semantic classification, taxonomies and graphs to elicit relationships among topics of educational documents. We validated our work using materials from Coursera (courses offered by Johns Hopkins University and University of Michigan) and a Higher Education Institute, from Brazil.
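The combination of taxonomies and graphs for relating materials can be illustrated schematically: two documents are linked when their topics meet below the taxonomy root. The tiny taxonomy and document-topic assignments below are invented for illustration and are not CIMAL's.

```python
# Toy illustration of relating documents via a topic taxonomy:
# documents are linked if their topics share a non-root ancestor.
parent = {                      # child -> parent in a tiny taxonomy
    "linear regression": "machine learning",
    "clustering": "machine learning",
    "machine learning": "computer science",
    "databases": "computer science",
}

def ancestors(topic: str) -> set:
    """The topic plus all its ancestors up the taxonomy."""
    out = {topic}
    while topic in parent:
        topic = parent[topic]
        out.add(topic)
    return out

doc_topics = {
    "lecture-1": {"linear regression"},
    "lecture-2": {"clustering"},
    "lecture-3": {"databases"},
}

def related(d1: str, d2: str) -> bool:
    """True if any topics of d1 and d2 meet below the taxonomy root."""
    a1 = set().union(*(ancestors(t) for t in doc_topics[d1]))
    a2 = set().union(*(ancestors(t) for t in doc_topics[d2]))
    return bool((a1 & a2) - {"computer science"})  # ignore the shared root
```

Running `related` over all document pairs yields the edges of a topic-relationship graph like the one the framework builds over educational materials.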
|
Saraiva, Márcio de Carvalho;
Medeiros, Claudia Bauzer
Correlating Educational Documents from Different Sources Through Graphs and Taxonomies (conference)
Proceedings of the SBC 33rd Brazilian Symposium on Databases (SBBD) 2018, Rio de Janeiro, Rio de Janeiro, Brazil,
Rio de Janeiro, Rio de Janeiro, Brazil,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Components, Content analysis and feature selection, educational material, Information extraction, Topics Classification
)
@conference{deSaraiva2018b,
abstract = {Digital educational documents are growing in size and variety, and
scientists are facing difficulties to find their way through them. One of the initiatives
that have emerged to solve this problem involves the use of automatic
classification algorithms. However, it is difficult to analyze implicit relationships
among topics of materials. This paper presents CIMAL, a framework for
enabling flexible access to material stored in arbitrary repositories. CIMAL
combines semantic classification, taxonomies and graphs to elicit relationships
among topics of educational documents. We validated our work using materials
from Coursera (courses offered by Johns Hopkins University and University of
Michigan) and a Higher Education Institute, from Brazil.},
address = {Rio de Janeiro, Rio de Janeiro, Brazil},
author = {Márcio de Carvalho Saraiva and Claudia Bauzer Medeiros},
booktitle = {Proceedings of the SBC 33rd Brazilian Symposium on Databases (SBBD) 2018, Rio de
Janeiro, Brazil},
date = {2018-08-25},
keyword = {Components, Content analysis and feature selection, educational material, Information extraction, Topics Classification},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2018/Marcio-SBBD2018.pdf},
organization = {IEEE},
title = {Correlating Educational Documents from Different Sources Through Graphs and Taxonomies},
year = {2018}
}
Digital educational documents are growing in size and variety, and scientists are facing difficulties to find their way through them. One of the initiatives that have emerged to solve this problem involves the use of automatic classification algorithms. However, it is difficult to analyze implicit relationships among topics of materials. This paper presents CIMAL, a framework for enabling flexible access to material stored in arbitrary repositories. CIMAL combines semantic classification, taxonomies and graphs to elicit relationships among topics of educational documents. We validated our work using materials from Coursera (courses offered by Johns Hopkins University and University of Michigan) and a Higher Education Institute, from Brazil.
|
de Araújo, Ricardo José;
Dos Reis, Julio Cesar;
Bonacin, Rodrigo
Understanding interface recoloring aspects by colorblind people: a user study (journal)
Universal Access in the Information Society,
Springer,
journal,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Colorblind people,
Interface recoloring,
Accessibility,
User study,
User preferences,
Recoloring algorithms,
Interface adaptation
)
@article{deAraujo2018u,
title={Understanding interface recoloring aspects by colorblind people: a user study},
author={de Ara{\'u}jo, Ricardo Jos{\'e} and Dos Reis, Julio Cesar and Bonacin, Rodrigo},
journal={Universal Access in the Information Society},
pages={1--18},
year={2018},
publisher={Springer}
}
The current web technologies make intensive the use of colors in web pages. Nowadays, colors are essential in the design of interfaces and play a central role in the distinction and comprehension of information. However, this affects colorblind users, i.e., those who have difficulties in recognizing or distinguishing colors. This paper presents a user study involving colorblind people to empirically investigate several aspects related to the recoloring of web interfaces. We aim to detect limitations, barriers, and needs about these users’ interaction with web pages. Our employed evaluation investigates indicators of satisfaction (contentment) and pleasantness (enjoyable) for several scenarios of interface recoloring adaptation. We found a ranking of application for interface adaptation techniques with the use of recoloring algorithms. The obtained results reveal the advantages of considering the colorblind individual’s needs and preferences for the development of adaptive systems. Our contribution can enhance web interface accessibility based on user interface adaptation techniques.
|
Gonçalves, Fabrício Matheus;
Jensen, Felipe Rodrigues;
dos Reis, Julio Cesar;
Baranauskas, Maria Cecília Calani
Enhancing Problem Clarification Artifacts with Online Deliberation (conference)
Proceedings of the 13th International Conference on Software Technologies - Volume 1: ICSOFT, 288-295, 2018, Porto, Portugal,
SciTePress,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Online Deliberation,
Socially Aware Computing,
Organisational Semiotics
)
@conference{gonçalves2018e,
author={Fabrício Matheus Gonçalves and Felipe Rodrigues Jensen and Julio Cesar dos Reis and Maria Cecília Calani Baranauskas},
title={Enhancing Problem Clarification Artifacts with Online Deliberation},
booktitle={Proceedings of the 13th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2018},
pages={288-295},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006869103220329},
isbn={978-989-758-320-9}
}
Information system design demands understanding requirements from diversified stakeholders. As an initial step, the problem clarification is essential to obtain a shared view of the involved problems and solutions. Several techniques have been proposed and practiced by the systems engineering community for problem clarification. While existing literature has brought problem clarification artifacts via a online computational system, stakeholders still lack means of meaning negotiation practices that usually happen in face-to-face meetings. This paper proposes a deliberation model integrated to the online use of problem clarification artifacts. The deliberation provides a collaborative process for building common ground for reflection. The proposed model illustrates the possibilities of deliberation in statements created in three artefacts of the Organizational Semiotics: Stakeholder Identification, Evaluation Frame and Semiotic Framework.
|
dos Reis, Julio Cesar;
de Brito, Mario Ferreira
Transparência para Humanos e Máquinas: Um framework para Publicar Dados Abertos Interconectados Semanticamente Descritos (workshop)
Anais do VI Workshop de Transparência em Sistemas,
SBC,
workshop,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
)
@inproceedings{dosReis2018t,
author = {Julio Cesar dos Reis and Mario Ferreira de Brito},
title = {Transparência para Humanos e Máquinas: Um framework para Publicar Dados Abertos Interconectados Semanticamente Descritos},
booktitle = {Anais do VI Workshop de Transparência em Sistemas},
location = {Natal},
year = {2018},
keywords = {},
issn = {2595-6140},
publisher = {SBC},
address = {Porto Alegre, RS, Brasil},
doi = {10.5753/wtrans.2018.3094},
url = {https://portaldeconteudo.sbc.org.br/index.php/wtrans/article/view/3094}
}
The Semantic Web allows the semantics of data to be described explicitly for both humans and machines. Data transparency requires publishing data in a structured form for other systems. Linked open data made available on the Web without copyright restrictions are the key to achieving transparency mechanisms. However, the data an organization needs to publish are often scattered across several isolated systems. In this article, we propose a framework to enable the publication of linked open data drawn from multiple data sources with transparency in mind. The study context is a public university where data of distinct natures need to be made available to users with different profiles.
|
de França, Breno Bernard Nicolau;
dos Reis, Julio Cesar;
de Azevedo, Rodolfo Jardim
Desafios Sociotécnicos e Prospecções para Promover Transparência de Dados na Universidade (workshop)
Anais do VI Workshop de Transparência em Sistemas,
SBC,
workshop,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
)
@inproceedings{deFrança2018d,
author = {Breno Bernard Nicolau de França and Julio Cesar dos Reis and Rodolfo Jardim de Azevedo},
title = {Desafios Sociotécnicos e Prospecções para Promover Transparência de Dados na Universidade},
booktitle = {Anais do VI Workshop de Transparência em Sistemas},
location = {Natal},
year = {2018},
keywords = {},
issn = {2595-6140},
publisher = {SBC},
address = {Porto Alegre, RS, Brasil},
doi = {10.5753/wtrans.2018.3091},
url = {https://portaldeconteudo.sbc.org.br/index.php/wtrans/article/view/3091}
}
Data transparency is a key aspect for the development of several sectors of society. In public universities, transparency increases knowledge about what is being developed, allowing an understanding of where resources are invested. In this article, we present the results of a study on transparency, based on data collected from two initiatives to spread the culture of transparency at UNICAMP. We identify sociotechnical challenges and point to an architectural solution to facilitate the processes associated with transparency within the university and to promote unrestricted, easy access to public data.
|
Dos Reis, Julio Cesar;
Bonacin, Rodrigo;
Jensen, Cristiane Josely;
Hornung, Heiko Horst;
Baranauskas, Maria Cecília Calani
Design of Interactive Mechanisms to Support the Communication of Users’ Intentions (journal)
Interacting with Computers,
Oxford University Press,
journal,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
graphical user interfaces; user interface design; computer supported collaborative work
)
@article{dosReis2018d,
author = {dos Reis, Julio Cesar and Bonacin, Rodrigo and Jensen, Cristiane Josely and Hornung, Heiko Horst and Baranauskas, Maria Cecília Calani},
title = "{Design of Interactive Mechanisms to Support the Communication of Users’ Intentions}",
journal = {Interacting with Computers},
volume = {30},
number = {4},
pages = {315-335},
year = {2018},
month = {07},
abstract = "{The communication and interpretation of users’ intentions play a key role in collaborative web discussions. However, existing computational mechanisms are not effective in supporting the expression of intentions during collaborations. In this article, we present the design of interactive mechanisms that allow users to make their intentions explicit. The study considered the domain of collaborative forums of software developers. The mechanisms design was based on semiotic principles and artifacts. They were implemented and evaluated to assess their effectiveness. We investigated to which extent the mechanisms support users in the task of interpreting message exchanges in forums that make use of the mechanisms. The results reveal the suitability of the designed interface elements, enabling more meaningful and successful communication.}",
issn = {0953-5438},
doi = {10.1093/iwc/iwy013},
url = {https://doi.org/10.1093/iwc/iwy013},
eprint = {https://academic.oup.com/iwc/article-pdf/30/4/315/25243814/iwy013.pdf},
}
The communication and interpretation of users’ intentions play a key role in collaborative web discussions. However, existing computational mechanisms are not effective in supporting the expression of intentions during collaborations. In this article, we present the design of interactive mechanisms that allow users to make their intentions explicit. The study considered the domain of collaborative forums of software developers. The mechanisms design was based on semiotic principles and artifacts. They were implemented and evaluated to assess their effectiveness. We investigated to which extent the mechanisms support users in the task of interpreting message exchanges in forums that make use of the mechanisms. The results reveal the suitability of the designed interface elements, enabling more meaningful and successful communication.
|
Bonacin, Rodrigo;
Calado, Ivo;
Dos Reis, Julio Cesar
A Metamodel for Supporting Interoperability in Heterogeneous Ontology Networks (conference)
Digitalisation, Innovation, and Transformation,
Springer,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
Ontology Chart,
OWL ontologies,
Soft ontologies,
Metamodeling
)
@InProceedings{bonacin2018m,
author="Bonacin, Rodrigo
and Calado, Ivo
and dos Reis, Julio Cesar",
editor="Liu, Kecheng
and Nakata, Keiichi
and Li, Weizi
and Baranauskas, Cecilia",
title="A Metamodel for Supporting Interoperability in Heterogeneous Ontology Networks",
booktitle="Digitalisation, Innovation, and Transformation",
year="2018",
publisher="Springer International Publishing",
address="Cham",
pages="187--196",
abstract="Ontologies are central artifacts in modern information systems. Ontology networks consider the coexistence of different ontology models in the same conceptual space. It is relevant that computational systems specified with distinct models based on different methods, as well as divergent metaphysical assumptions, exchange data to interoperate one with the other. However, there is a lack of techniques to enable the adequate conciliation among models. In this paper, we propose and formalize a metamodel to enable the construction of data models aiming to support the interoperability at the technical level. We present the use of our metamodel to conciliate, without explicit transformations, Ontology Charts from Organizational Semiotics with Semantic Web OWL ontologies and less structured models such as soft ontologies. Our results indicate the possibility of identifying an entity from one model into another, enabling data exchange and interpretation in heterogeneous ontology network.",
isbn="978-3-319-94541-5"
}
Ontologies are central artifacts in modern information systems. Ontology networks consider the coexistence of different ontology models in the same conceptual space. It is relevant that computational systems specified with distinct models based on different methods, as well as divergent metaphysical assumptions, exchange data to interoperate one with the other. However, there is a lack of techniques to enable the adequate conciliation among models. In this paper, we propose and formalize a metamodel to enable the construction of data models aiming to support the interoperability at the technical level. We present the use of our metamodel to conciliate, without explicit transformations, Ontology Charts from Organizational Semiotics with Semantic Web OWL ontologies and less structured models such as soft ontologies. Our results indicate the possibility of identifying an entity from one model into another, enabling data exchange and interpretation in heterogeneous ontology network.
|
Bonacin, Rodrigo;
dos Reis, Julio Cesar;
Mendes, Edemar P.;
Nabuco, Olga
Exploring intentions on electronic health records retrieval. Studies with collaborative scenarios (journal)
Ingenierie des Systemes d'Information,
Lavoisier,
journal,
2018.
(
Abstract |
Links |
BibTeX |
Tags:
information retrieval,
electronic health records,
information sharing,
query expansion,
intentions,
illocutions,
speech acts theory
)
@article{bonacin2018e,
author={Bonacin, Rodrigo and dos Reis, Julio Cesar and Mendes, Edemar P. and Nabuco, Olga},
year={2018},
title={Exploring intentions on electronic health records retrieval. Studies with collaborative scenarios},
journal={Ingenierie des Systemes d'Information},
volume={23},
number={2},
pages={111-135},
note={Copyright Lavoisier 2018; last updated 2019-01-15},
abstract={Despite the potential benefits of Electronic Health Records (EHRs), health care professionals face difficulties in the selection of relevant documents in huge repositories during collaborative activities. In this article, we investigate the development of an innovative Information Retrieval (IR) and sharing mechanism that explores the formal representation of intentions in EHRs. To this end, this research relies on Organizational Semiotics and Speech Acts Theory. We defined an algorithm to filter and sort search results relying on intention classes explicitly declared as query parameters in the search mechanism. As our main contribution, we developed the SiRBI IR system for supporting group knowledge sharing through EHRs. To evaluate the proposal, we conducted an experimental study using a real-world EHR repository in two search scenarios, which involve an interdisciplinary group. The obtained results demonstrated the effectiveness of the solution.},
keywords={information retrieval; electronic health records; information sharing; query expansion; intentions; illocutions; speech acts theory},
issn={1633-1311},
language={English},
url={http://www.iieta.org/journals/isi/paper/10.3166/ISI.23.2.111-135}
}
Despite the potential benefits of Electronic Health Records (EHRs), health care professionals face difficulties in the selection of relevant documents in huge repositories during collaborative activities. In this article, we investigate the development of an innovative Information Retrieval (IR) and sharing mechanism that explores the formal representation of intentions in EHRs. To this end, this research relies on Organizational Semiotics and Speech Acts Theory. We defined an algorithm to filter and sort search results relying on intention classes explicitly declared as query parameters in the search mechanism. As our main contribution, we developed the SiRBI IR system for supporting group knowledge sharing through EHRs. To evaluate the proposal, we conducted an experimental study using a real-world EHR repository in two search scenarios, which involve an interdisciplinary group. The obtained results demonstrated the effectiveness of the solution.
|
2017 |
Carvalho, Lucas A. M. C.;
Essawy, Bakinam T.;
Garijo, Daniel;
Medeiros, Claudia Bauzer;
Gil, Yolanda
Requirements for Supporting the Iterative Exploration of Scientific Workflow Variants (conference)
2017 Workshop on Capturing Scientific Knowledge (SciKnow), held in conjunction with the ACM International Conference on Knowledge Capture (K-CAP),
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Knowledge Capture, Scientific Workflows, Workflow Variants, Workshop
)
@conference{Carvalho2017b,
abstract = {Workflow systems support scientists in capturing computational experiments and managing their execution. However, such systems are not designed to help scientists create and track the many related workflows that they build as variants, trying different software implementations and distinct ways to process data and deciding what to do next by looking at previous workflow results. An initial workflow will be changed to create many new variants thereof that differ from each other in one or more steps. Our goal is to support scientists in the iterative design of computational experiments by assisting them in the creation and management of workflow variants. In this paper, we present several use cases for creating workflow variants in hydrology, from which we specify requirements for workflow variants. We also discuss major research directions to address these requirements.},
author = {Lucas A. M. C. Carvalho and Bakinam T. Essawy and Daniel Garijo and Claudia Bauzer Medeiros and Yolanda Gil},
booktitle = {2017 Workshop on Capturing Scientific Knowledge (SciKnow), held in conjunction with the ACM International Conference on Knowledge Capture (K-CAP)},
date = {2017-12-04},
keyword = {Knowledge Capture, Scientific Workflows, Workflow Variants, Workshop},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2018/01/workflow-variants-sciknow-2017-camera-ready.pdf},
pages = {1-8},
title = {Requirements for Supporting the Iterative Exploration of Scientific Workflow Variants},
year = {2017}
}
Workflow systems support scientists in capturing computational experiments and managing their execution. However, such systems are not designed to help scientists create and track the many related workflows that they build as variants, trying different software implementations and distinct ways to process data and deciding what to do next by looking at previous workflow results. An initial workflow will be changed to create many new variants thereof that differ from each other in one or more steps. Our goal is to support scientists in the iterative design of computational experiments by assisting them in the creation and management of workflow variants. In this paper, we present several use cases for creating workflow variants in hydrology, from which we specify requirements for workflow variants. We also discuss major research directions to address these requirements.
|
Santo, Jacqueline Midlej do Espírito;
Medeiros, Claudia Bauzer
Semantic Interoperability of Clinical Data (conference)
Lecture Notes in Bioinformatics (LNBI) - Proceedings of 12th International Conference on Data Integration in the Life Sciences,
Luxemburgo, Luxemburgo,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Data integration, healthcare
)
@conference{Santo2017,
abstract = {The interoperability of clinical information systems is particularly complicated due to the use of outdated technologies and the absence of consensus about standards. The literature applies standard-based approaches to achieve clinical data interoperability, but many systems do not adopt any standard, requiring a full redesigning process. Instead, we propose a generic computational approach that combines a hierarchical organization of mediator schemas to support the interoperability across distinct data sources. Second, our work takes advantage of knowledge bases to be linked to clinical data, and exploit these semantic linkages via queries. The paper shows case studies to validate our
proposal.},
address = {Luxemburgo, Luxemburgo},
author = {Jacqueline Midlej do Espírito Santo and Claudia Bauzer Medeiros},
booktitle = {Lecture Notes in Bioinformatics (LNBI) - Proceedings of 12th International Conference on Data Integration in the Life Sciences},
date = {2017-11-14},
editor = {Springer International Publishing AG},
keyword = {Data integration, healthcare},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/10/DILS-jacqueline.pdf},
title = {Semantic Interoperability of Clinical Data},
volume = {10649},
year = {2017}
}
The interoperability of clinical information systems is particularly complicated due to the use of outdated technologies and the absence of consensus about standards. The literature applies standard-based approaches to achieve clinical data interoperability, but many systems do not adopt any standard, requiring a full redesigning process. Instead, we propose a generic computational approach that combines a hierarchical organization of mediator schemas to support the interoperability across distinct data sources. Second, our work takes advantage of knowledge bases to be linked to clinical data, and exploit these semantic linkages via queries. The paper shows case studies to validate our proposal.
|
Moreira, Eliana Alves;
dos Reis, Julio Cesar;
Baranauskas, M. Cecília C.
TangiSAM: Tangible Artifacts for Evaluation of Affective States (conference)
Proceedings of the XVI Brazilian Symposium on Human Factors in Computing Systems,
ACM,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Tangible interfaces,
Affective states,
Evaluation
)
@inproceedings{moreira2017t,
author = {Moreira, Eliana Alves and dos Reis, Julio Cesar and Baranauskas, M. Cec\'{\i}lia C.},
title = {TangiSAM: Tangible Artifacts for Evaluation of Affective States},
year = {2017},
isbn = {9781450363778},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3160504.3160525},
doi = {10.1145/3160504.3160525},
booktitle = {Proceedings of the XVI Brazilian Symposium on Human Factors in Computing Systems},
articleno = {47},
numpages = {10},
keywords = {Tangible interfaces, Affective states, Evaluation},
location = {Joinville, Brazil},
series = {IHC 2017}
}
Evaluation of affective states is essential for assessing people's perceptions during activities and interaction experience. There is, however, a lack of playful and accessible proposals enabling children for example to complete evaluation activities thoroughly. This paper proposes the TangiSAM, a technological environment with tangible three-dimensional manikins representing the affective state in the dimensions of pleasure, arousal and dominance. We present the results of a study conducted to investigate the usage of our proposal in a real-world setting with children and teachers. Obtained results showed that the TangiSAM was more effective than other approaches for evaluation.
|
Diaz, Juan S. B.;
Medeiros, Claudia Bauzer
WorkflowHunt: combining keyword and semantic search in scientific workflow repositories (conference)
Proceedings of the IEEE 13th International Conference on eScience 2017,
IEEE,
Auckland, New Zealand,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Scientific Workflows, Semantic Annotation, Workflow Retrieval
)
@conference{Diaz2017,
abstract = {Scientific datasets, and the experiments that analyze them are growing in size and complexity, and scientists are facing difficulties to share such resources. Some initiatives have emerged to try to solve this problem. One of them involves the use of scientific workflows to represent and enact experiment execution. There is an increasing number of workflows that are potentially relevant for more than one scientific domain. However, it is hard to find workflows suitable for reuse given an experiment. Creating a workflow takes time and resources, and their reuse helps scientists to build new workflows faster and in a more reliable way. Search mechanisms in workflow repositories should provide different options for workflow discovery, but it is difficult for generic repositories to provide multiple mechanisms. This paper presents WorkflowHunt, a hybrid architecture for workflow search and discovery for generic repositories, which combines keyword and semantic search to allow finding relevant workflows using different search methods. We validated our architecture creating a prototype that uses real workflows and metadata from myExperiment, and compare search results via WorkflowHunt and via myExperiment’s search interface.},
address = {Auckland, New Zealand},
author = {Juan S. B. Diaz and Claudia Bauzer Medeiros},
booktitle = {Proceedings of the IEEE 13th International Conference on eScience 2017},
date = {2017-10-24},
keyword = {Scientific Workflows, Semantic Annotation, Workflow Retrieval},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/10/PID4958635.pdf},
publisher = {IEEE},
title = {WorkflowHunt: combining keyword and semantic search in scientific workflow repositories},
year = {2017}
}
Scientific datasets, and the experiments that analyze them are growing in size and complexity, and scientists are facing difficulties to share such resources. Some initiatives have emerged to try to solve this problem. One of them involves the use of scientific workflows to represent and enact experiment execution. There is an increasing number of workflows that are potentially relevant for more than one scientific domain. However, it is hard to find workflows suitable for reuse given an experiment. Creating a workflow takes time and resources, and their reuse helps scientists to build new workflows faster and in a more reliable way. Search mechanisms in workflow repositories should provide different options for workflow discovery, but it is difficult for generic repositories to provide multiple mechanisms. This paper presents WorkflowHunt, a hybrid architecture for workflow search and discovery for generic repositories, which combines keyword and semantic search to allow finding relevant workflows using different search methods. We validated our architecture creating a prototype that uses real workflows and metadata from myExperiment, and compare search results via WorkflowHunt and via myExperiment’s search interface.
|
Saraiva, Márcio de Carvalho;
Medeiros, Claudia Bauzer
Finding out Topics in Educational Materials Using their Components (conference)
Proceedings of THE 47th Annual Frontiers in Education (FIE) Conference, October 18-21, 2017,
Indianapolis, Indiana, USA,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Components, Content analysis and feature selection, educational material, Information extraction, Topics Classification
)
@conference{deSaraiva2017,
abstract = {The Web is witnessing an exponential growth of distributed and heterogeneous educational material. This hampers distinguishing among contents of these materials, as well as their retrieval. While information retrieval and classification mechanisms concentrate on corpus analysis, annotation approaches either target specific formats or require that a document follows interoperable standards. Rather than target only textual characteristics, our strategy is mainly based on components of educational material. The header, body, footer and numbering of slides and progress bar are examples of components of slides and videos. Though our work is general purpose, it is being tested against slides and videos from Coursera, a web platform that provides universal access to online education material and courses from universities and organizations around the world.},
address = {Indianapolis, Indiana, USA},
author = {Márcio de Carvalho Saraiva and Claudia Bauzer Medeiros},
booktitle = {Proceedings of THE 47th Annual Frontiers in Education (FIE) Conference, October 18-21, 2017},
date = {2017-10-17},
keyword = {Components, Content analysis and feature selection, educational material, Information extraction, Topics Classification},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/07/ArtigoFIE2017-MarcioSaraiva.pdf},
organization = {IEEE},
title = {Finding out Topics in Educational Materials Using their Components},
year = {2017}
}
The Web is witnessing an exponential growth of distributed and heterogeneous educational material. This hampers distinguishing among contents of these materials, as well as their retrieval. While information retrieval and classification mechanisms concentrate on corpus analysis, annotation approaches either target specific formats or require that a document follows interoperable standards. Rather than target only textual characteristics, our strategy is mainly based on components of educational material. The header, body, footer and numbering of slides and progress bar are examples of components of slides and videos. Though our work is general purpose, it is being tested against slides and videos from Coursera, a web platform that provides universal access to online education material and courses from universities and organizations around the world.
|
Daltio, Jaudete
Views over Graph Databases: A Multifocus Approach for Heterogeneous Data (phdthesis)
University of Campinas - Institute of Computing,
phdthesis,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
graph database,
PhDThesis,
Views,
Multifocus
)
@phdthesis{Daltio2017,
abstract = {Scientific research has become data-intensive and data-dependent. This new research paradigm requires sophisticated computer science techniques and technologies to support the life cycle of scientific data and collaboration among scientists from distinct areas. A major requirement is that researchers working in data-intensive interdisciplinary teams demand construction of multiple perspectives of the world, built over the same datasets. Present solutions cover a wide range of aspects, from the design of interoperability standards to the use of non-relational database management systems. None of these efforts, however, adequately meet the needs of multiple perspectives, which are called foci in the thesis. Basically, a focus is designed/built to cater to a research group (even within a single project) that needs to deal with a subset of data of interest, under multiple aggregation/generalization levels. The definition and creation of a focus are complex tasks that require mechanisms and engines to manipulate multiple representations of the same real world phenomenon. This PhD research aims to provide multiple foci over heterogeneous data. To meet this challenge, we deal with four research problems. The first two were (1) choosing an appropriate data management paradigm; and (2) eliciting multifocus requirements. Our work towards solving these problems made us choose graph databases to answer (1) and the concept of views from relational databases for (2). However, there is no consensual data model for graph databases and views are seldom discussed in this context. Thus, research problems (3) and (4) are: (3) specifying an adequate graph data model and (4) defining a framework to handle views on graph databases.
Our research on these problems results in the main contributions of this thesis: (i) to present the case for the use of graph databases as the persistence layer in multifocus research -- a schemaless and relationship-driven type of database that provides a full understanding of data connections; (ii) to define views for graph databases to support the need for multiple foci, considering graph data manipulation, graph algorithms and traversal tasks; (iii) to propose a property graph data model (PGDM) to fill the gap left by the absence of a full-fledged data model for graphs; (iv) to specify and implement a framework, named Graph-Kaleidoscope, that supports views over graph databases; and (v) to validate our framework in real-world applications in two domains -- biodiversity and environmental resources -- typical examples of multidisciplinary research that involve the analysis of interactions of phenomena using heterogeneous data.},
author = {Jaudete Daltio},
date = {2017-09-12},
keyword = {graph database; PhDThesis; Views; Multifocus},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/12/jaudete_daltio_tese.pdf},
school = {University of Campinas - Institute of Computing},
title = {Views over Graph Databases: A Multifocus Approach for Heterogeneous Data},
year = {2017}
}
Scientific research has become data-intensive and data-dependent. This new research paradigm requires sophisticated computer science techniques and technologies to support the life cycle of scientific data and collaboration among scientists from distinct areas. A major requirement is that researchers working in data-intensive interdisciplinary teams demand construction of multiple perspectives of the world, built over the same datasets. Present solutions cover a wide range of aspects, from the design of interoperability standards to the use of non-relational database management systems. None of these efforts, however, adequately meet the needs of multiple perspectives, which are called foci in the thesis. Basically, a focus is designed/built to cater to a research group (even within a single project) that needs to deal with a subset of data of interest, under multiple aggregation/generalization levels. The definition and creation of a focus are complex tasks that require mechanisms and engines to manipulate multiple representations of the same real world phenomenon. This PhD research aims to provide multiple foci over heterogeneous data. To meet this challenge, we deal with four research problems. The first two were (1) choosing an appropriate data management paradigm; and (2) eliciting multifocus requirements. Our work towards solving these problems made us choose graph databases to answer (1) and the concept of views from relational databases for (2). However, there is no consensual data model for graph databases and views are seldom discussed in this context. Thus, research problems (3) and (4) are: (3) specifying an adequate graph data model and (4) defining a framework to handle views on graph databases.
Our research on these problems results in the main contributions of this thesis: (i) to present the case for the use of graph databases as the persistence layer in multifocus research -- a schemaless and relationship-driven type of database that provides a full understanding of data connections; (ii) to define views for graph databases to support the need for multiple foci, considering graph data manipulation, graph algorithms and traversal tasks; (iii) to propose a property graph data model (PGDM) to fill the gap left by the absence of a full-fledged data model for graphs; (iv) to specify and implement a framework, named Graph-Kaleidoscope, that supports views over graph databases; and (v) to validate our framework in real-world applications in two domains -- biodiversity and environmental resources -- typical examples of multidisciplinary research that involve the analysis of interactions of phenomena using heterogeneous data.
|
Destro, Juliana Medeiros;
dos Reis, Julio Cesar;
Carvalho, Ariadne Maria Brito Rizzoni;
Ricarte, Ivan Luiz Marques
Experimental studies for revealing key factors of cross-language ontology alignments (conference)
Brazilian Ontology Research Seminar (ONTOBRAS 2017),
CEUR-WS,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
)
@inproceedings{destro2017e,
title={Experimental studies for revealing key factors of cross-language ontology alignments},
author={Destro, Juliana Medeiros and dos Reis, Julio Cesar and Carvalho, Ariadne Maria Brito Rizzoni and Ricarte, Ivan Luiz Marques},
booktitle={Brazilian Ontology Research Seminar (ONTOBRAS 2017)},
year={2017}
}
Cross-language alignment between ontologies is relevant for the interoperability of systems in specific domains, such as in the life science domain. Although the literature has proposed techniques for the alignment of ontologies described in different languages, the influence of linguistic characteristics from domain-specific ontologies on such alignments has barely been appraised. This study proposes a series of experiments based on real-world mappings to understand the matching between ontologies in different languages. It investigates the role of a pivot-language related to the domain for the purpose of a fully automatic cross-language alignment. In particular, we analyse the influence of syntactic and semantic similarity methods and the structure of terms denoting concepts in ontologies. Experimental results, focused on the life science domain, indicate useful factors to take into account in the design of matching algorithms for domain-specific cross-language alignment.
|
Dos Reis, Julio Cesar;
Bonacin, Rodrigo;
Baranauskas, Maria Cecilia Calani
Recognizing Intentions in Free Text Messages: Studies with Portuguese Language (conference)
2017 IEEE 26th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE),
IEEE,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Illocutions,
Intention,
Pragmatics
)
@INPROCEEDINGS{dosReis2017r,
author={J. C. {Dos Reis} and R. {Bonacin} and M. C. {Calani Baranauskas}},
booktitle={2017 IEEE 26th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)},
title={Recognizing Intentions in Free Text Messages: Studies with Portuguese Language},
year={2017},
pages={302-307}
}
Recent literature indicates that user intention analysis brings benefits for several computational tasks including information retrieval and communication. However, intentions are expressed implicitly in natural language texts. Domain related specificities and cultural language aspects hamper their machine representation and interpretation. This requires thorough investigations of intention recognition methods in free text to permit further exploring them. In this paper, we propose a technique based on the matching with representative key phrases and semantic extension of terms to detect instances of intention classes in natural language sentences. We explore a multidimensional framework of illocution categorization to structure the distinct intention classes. The conducted experiments with Portuguese language datasets of different characteristics reveal the potentialities of our method when analyzing the outcome of state-of-the-art machine-learning based text-mining techniques.
|
Tacioli, Leandro;
Toledo, Luís Felipe;
Medeiros, Claudia Bauzer
An Architecture for Animal Sound Identification based on Multiple Feature Extraction and Classification Algorithms (conference)
11th BreSci - Brazilian e-Science Workshop,
Sociedade Brasileira de Computação (SBC),
2017.
(
Abstract |
Links |
BibTeX |
Tags:
eScience, Feature Extraction, Pattern recognition
)
@conference{Tacioli2017,
abstract = {Automatic identification of animals is extremely useful for scientists, providing ways to monitor species and changes in ecological communities. The choice of effective audio features and classification techniques is a challenge in any audio recognition system, especially in bioacoustics, which commonly uses several algorithms. This paper presents a novel software architecture that supports multiple feature extraction and classification algorithms to help in the identification of animal species from their recorded sounds. This architecture was implemented by the WASIS software, freely available on the Web.},
author = {Leandro Tacioli and Luís Felipe Toledo and Claudia Bauzer Medeiros},
booktitle = {11th BreSci - Brazilian e-Science Workshop},
date = {2017-07-06},
keyword = {eScience, Feature Extraction, Pattern recognition},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/06/Tacioli-BreSci2017.pdf},
publisher = {Sociedade Brasileira de Computação (SBC)},
title = {An Architecture for Animal Sound Identification based on Multiple Feature Extraction and Classification Algorithms},
year = {2017}
}
Automatic identification of animals is extremely useful for scientists, providing ways to monitor species and changes in ecological communities. The choice of effective audio features and classification techniques is a challenge in any audio recognition system, especially in bioacoustics, which commonly uses several algorithms. This paper presents a novel software architecture that supports multiple feature extraction and classification algorithms to help in the identification of animal species from their recorded sounds. This architecture was implemented by the WASIS software, freely available on the Web.
|
Carvalho, Lucas A. M. C.;
Malaverri, Joana E. Gonzales;
Medeiros, Claudia Bauzer
Implementing W2Share: Supporting Reproducibility and Quality Assessment in eScience (conference)
11th BreSci - Brazilian e-Science Workshop,
Sociedade Brasileira de Computação (SBC),
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Data quality, Provenance Information, Scientific Workflows, Semantic Annotation, W2Share framework
)
@conference{Carvalho2017,
abstract = {An open problem in the scientific community is that of supporting reproducibility and quality assessment of scientific experiments. Solutions need to be able to help scientists to reproduce experimental procedures in a reliable manner and, at the same time, to provide mechanisms for documenting the experiments to enhance integrity and transparency. Moreover, solutions need to incorporate features that allow the assessment of procedures, data used and results of those experiments. In this context, we designed W2Share, a framework to meet these requirements. This paper introduces our first implementation of W2Share, which moreover guides scientists in a step-by-step process to ensure reproducibility based on a script-to-workflow conversion strategy. W2Share also incorporates features that allow annotating experiments with quality information. We validate our prototype using a real-world scenario in Bioinformatics.},
author = {Lucas A. M. C. Carvalho and Joana E. Gonzales Malaverri and Claudia Bauzer Medeiros},
booktitle = {11th BreSci - Brazilian e-Science Workshop},
date = {2017-07-06},
keyword = {Data quality, Provenance Information, Scientific Workflows, Semantic Annotation, W2Share framework},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/05/w2share-bresci2017-camera-ready.pdf},
pages = {1-8},
publisher = {Sociedade Brasileira de Computação (SBC)},
title = {Implementing W2Share: Supporting Reproducibility and Quality Assessment in eScience},
year = {2017}
}
An open problem in the scientific community is that of supporting reproducibility and quality assessment of scientific experiments. Solutions need to be able to help scientists to reproduce experimental procedures in a reliable manner and, at the same time, to provide mechanisms for documenting the experiments to enhance integrity and transparency. Moreover, solutions need to incorporate features that allow the assessment of procedures, data used and results of those experiments. In this context, we designed W2Share, a framework to meet these requirements. This paper introduces our first implementation of W2Share, which moreover guides scientists in a step-by-step process to ensure reproducibility based on a script-to-workflow conversion strategy. W2Share also incorporates features that allow annotating experiments with quality information. We validate our prototype using a real-world scenario in Bioinformatics.
|
Tacioli, Leandro
WASIS - Bioacoustic Species Identification based on Multiple Feature Extraction and Classification Algorithms (mastersthesis)
Universidade Estadual de Campinas - UNICAMP,
mastersthesis,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Animals - Identification, Bioacoustics, Computer systems, Pattern recognition
)
@mastersthesis{Tacioli2017b,
abstract = {Automatic identification of animal species based on their sounds is one of the means to conduct research in bioacoustics. This research domain provides, for instance, ways to monitor rare and endangered species, to analyze changes in ecological communities, or ways to study the social meaning of animal calls in their behavioral contexts. Identification mechanisms are typically executed in two stages: feature extraction and classification. Both stages present challenges, in computer science and in bioacoustics. The choice of effective feature extraction and classification algorithms is a challenge in any audio recognition system, especially in bioacoustics. Considering the wide variety of animal groups studied, algorithms are tailored to specific groups. Audio classification techniques are also sensitive to the extracted features, and conditions surrounding the recordings. As a result, most bioacoustic software is not extensible, therefore limiting the kinds of recognition experiments that can be conducted. Given this scenario, this dissertation proposes a software architecture that allows multiple feature extraction, feature fusion and classification algorithms to support scientists and the general public in the identification of animal species through their recorded sounds. This architecture was implemented by the WASIS software, freely available on the Web. Since WASIS is open-source and extensible, experts can perform experiments with many combinations of descriptor-classifier pairs to choose the most appropriate ones for the identification of given animal sub-groups. A number of algorithms were implemented, serving as the basis for a comparative study that recommends sets of feature extraction and classification algorithms for three animal groups.},
author = {Leandro Tacioli},
date = {2017-07-03},
keyword = {Animals - Identification, Bioacoustics, Computer systems, Pattern recognition},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/07/LeandroTacioli-Mestrado.pdf},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {WASIS - Bioacoustic Species Identification based on Multiple Feature Extraction and Classification Algorithms},
year = {2017}
}
Automatic identification of animal species based on their sounds is one of the means to conduct research in bioacoustics. This research domain provides, for instance, ways to monitor rare and endangered species, to analyze changes in ecological communities, or ways to study the social meaning of animal calls in their behavioral contexts. Identification mechanisms are typically executed in two stages: feature extraction and classification. Both stages present challenges, in computer science and in bioacoustics. The choice of effective feature extraction and classification algorithms is a challenge in any audio recognition system, especially in bioacoustics. Considering the wide variety of animal groups studied, algorithms are tailored to specific groups. Audio classification techniques are also sensitive to the extracted features, and conditions surrounding the recordings. As a result, most bioacoustic software is not extensible, therefore limiting the kinds of recognition experiments that can be conducted. Given this scenario, this dissertation proposes a software architecture that allows multiple feature extraction, feature fusion and classification algorithms to support scientists and the general public in the identification of animal species through their recorded sounds. This architecture was implemented by the WASIS software, freely available on the Web. Since WASIS is open-source and extensible, experts can perform experiments with many combinations of descriptor-classifier pairs to choose the most appropriate ones for the identification of given animal sub-groups. A number of algorithms were implemented, serving as the basis for a comparative study that recommends sets of feature extraction and classification algorithms for three animal groups.
|
Dos Reis, Julio Cesar;
Jensen, Cristiane Josely;
Bonacin, Rodrigo;
Hornung, Heiko;
Calani Baranauskas, Maria Cecília
Participatory Icons Specification for Expressing Intentions in Computer-Mediated Communications (conference)
Enterprise Information Systems,
Springer,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Icons,
Emoticons,
Meanings,
Intentions,
Pragmatics,
Communication,
HCI
)
@InProceedings{dosReis2017p,
author="Dos Reis, Julio Cesar
and Jensen, Cristiane Josely
and Bonacin, Rodrigo
and Hornung, Heiko
and Calani Baranauskas, Maria Cec{\'i}lia",
editor="Hammoudi, Slimane
and Maciaszek, Leszek A.
and Missikoff, Michele M.
and Camp, Olivier
and Cordeiro, Jos{\'e}",
title="Participatory Icons Specification for Expressing Intentions in Computer-Mediated Communications",
booktitle="Enterprise Information Systems",
year="2017",
publisher="Springer International Publishing",
address="Cham",
pages="414--435",
abstract="Web-mediated conversations require treating intentions more explicitly. The literature lacks adequate design methods and interactive mechanisms to support users in the sharing of intentions. This research assumes that icons representing emotions play a central role as a means of aiding users to convey intentions in communication tasks. This article proposes a method to specify emoticons for representing the users' intentions, named ``intenticons''. The work explores Speech Act Theory and Semiotics in a conceptual framework to structure classes of intentions. We conduct participatory activities with 40 users to experiment with the method. The obtained intenticons were evaluated with a different set of users to reveal their effectiveness. The obtained results suggest the feasibility of the method to select and enhance emoticons for intention expression. Evaluations point out that most of the achieved intenticons indicate an acceptable degree of representativeness for the intention classes.",
isbn="978-3-319-62386-3"
}
Web-mediated conversations require treating intentions more explicitly. The literature lacks adequate design methods and interactive mechanisms to support users in the sharing of intentions. This research assumes that icons representing emotions play a central role as a means of aiding users to convey intentions in communication tasks. This article proposes a method to specify emoticons for representing the users' intentions, named ``intenticons''. The work explores Speech Act Theory and Semiotics in a conceptual framework to structure classes of intentions. We conduct participatory activities with 40 users to experiment with the method. The obtained intenticons were evaluated with a different set of users to reveal their effectiveness. The obtained results suggest the feasibility of the method to select and enhance emoticons for intention expression. Evaluations point out that most of the achieved intenticons indicate an acceptable degree of representativeness for the intention classes.
|
Filho, Francisco José Nardi
Hybrid Narrative and Clinical Knowledge Base for Emergency Medicine Training (mastersthesis)
Universidade Estadual de Campinas - UNICAMP,
mastersthesis,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Artificial intelligence - Medical applications, Emergency medicine, Expert systems (Computer science), Information storage and retrieval systems, Information systems
)
@mastersthesis{NardiFilho2017,
abstract = {Software for medical training usually follows two types of approaches for the representation of its data. One type is software for simulation-based training or virtual patients - which have highly structured representations of the clinical data and simulation plans. Another type is systems which focus on the narrative of a clinical case in free-text format - e.g., the Jacinto emergency medicine learning environment. In this case, the clinical data mixes with the narrative in an unstructured format. Thus, we propose a model for a hybrid narrative and clinical knowledge base for emergency medicine training that combines both approaches. We hypothesize that by connecting narratives with structured clinical information, we can take advantage of the strongest points of each approach. On the one hand, structured clinical data offers flexibility for the production of case variations and alternative plans, which gives the machine more autonomy to assess user performance. On the other hand, free-text narratives enable the introduction of real scenario relevant aspects and context, beyond clinical data. In this work, we present a practical experiment involving the database of the Jacinto emergency medicine learning environment.},
author = {Francisco José Nardi Filho},
date = {2017-05-19},
keyword = {Artificial intelligence - Medical applications, Emergency medicine, Expert systems (Computer science), Information storage and retrieval systems, Information systems},
link = {http://www.repositorio.unicamp.br/bitstream/REPOSIP/325067/1/Nardi%20Filho_FranciscoJose_M.pdf},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {Hybrid Narrative and Clinical Knowledge Base for Emergency Medicine Training},
year = {2017}
}
Software for medical training usually follows two types of approaches for the representation of its data. One type is software for simulation-based training or virtual patients - which have highly structured representations of the clinical data and simulation plans. Another type is systems which focus on the narrative of a clinical case in free-text format - e.g., the Jacinto emergency medicine learning environment. In this case, the clinical data mixes with the narrative in an unstructured format. Thus, we propose a model for a hybrid narrative and clinical knowledge base for emergency medicine training that combines both approaches. We hypothesize that by connecting narratives with structured clinical information, we can take advantage of the strongest points of each approach. On the one hand, structured clinical data offers flexibility for the production of case variations and alternative plans, which gives the machine more autonomy to assess user performance. On the other hand, free-text narratives enable the introduction of real scenario relevant aspects and context, beyond clinical data. In this work, we present a practical experiment involving the database of the Jacinto emergency medicine learning environment.
|
de Araújo, Ricardo José;
dos Reis, Julio Cesar;
Bonacin, Rodrigo
Colors Similarity Computation for User Interface Adaptation (conference)
International Conference on Universal Access in Human-Computer Interaction,
Springer,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Accessibility,
Color blindness,
Interface adaptation,
Color similarity
)
@ARTICLE{deAraujo2017333,
author={de Araújo, R.J. and dos Reis, J.C. and Bonacin, R.},
title={Colors similarity computation for user interface adaptation},
journal={Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)},
year={2017},
volume={10277 LNCS},
pages={333-345},
doi={10.1007/978-3-319-58706-6_27},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85025145600&doi=10.1007%2f978-3-319-58706-6_27&partnerID=40&md5=69b24b51ec60b56bc0da373c245d0f5f},
abstract={Color blind people face various difficulties interacting with web systems. Interface adaptation techniques designed for recoloring images and web interfaces may deal with several color blindness visualization issues. However, different situations, preferences and individual needs make it complex to choose the most suitable recoloring technique. This article proposes an original algorithm to compute similarity between colors. We aim to support the decision process of selecting the most suitable adaptation technique according to the type of color blindness and interaction context. The algorithm ponders arguments for taking the users’ preferences and limitations into account. Our experimental analysis implements various configurations by testing the weights in the color distance calculation according to the colorblindness type. The obtained results reveal the advantages of considering the type of colorblindness in the color similarity computation.},
author_keywords={Accessibility; Color blindness; Color similarity; Interface adaptation},
publisher={Springer Verlag},
document_type={Conference Paper},
source={Scopus},
}
Color blind people face various difficulties interacting with web systems. Interface adaptation techniques designed for recoloring images and web interfaces may deal with several color blindness visualization issues. However, different situations, preferences and individual needs make it complex to choose the most suitable recoloring technique. This article proposes an original algorithm to compute similarity between colors. We aim to support the decision process of selecting the most suitable adaptation technique according to the type of color blindness and interaction context. The algorithm ponders arguments for taking the users’ preferences and limitations into account. Our experimental analysis implements various configurations by testing the weights in the color distance calculation according to the colorblindness type. The obtained results reveal the advantages of considering the type of colorblindness in the color similarity computation.
|
Gonçalves, F.M.;
Duarte, E.F.;
Dos Reis, J.C.;
Baranauskas, M.C.C.
An analysis of online discussion platforms for academic deliberation support (conference)
International Conference on Social Computing and Social Media,
Springer,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
Academic deliberation,
Collaboration,
Considerate,
Debate hub,
HCI,
Interaction design,
Social computing,
Trello
)
@ARTICLE{Gonçalves201791,
author={Gonçalves, F.M. and Duarte, E.F. and Dos Reis, J.C. and Baranauskas, M.C.C.},
title={An analysis of online discussion platforms for academic deliberation support},
journal={Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)},
year={2017},
volume={10283 LNCS},
pages={91-109},
doi={10.1007/978-3-319-58562-8_8},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85025124946&doi=10.1007%2f978-3-319-58562-8_8&partnerID=40&md5=65d4ad5eb5570cd951773bc6d72565ca},
abstract={Asynchronous online discussions are relevant for supporting and promoting debates among people. Nevertheless, achieving beneficial discussion requires adequate software applications with specific features to support people’s participation, e.g., mechanisms for structured pros and cons arguments. Although the literature is vast in discussing online forum usage, requirements for the design of platforms for academic deliberation have not been addressed in the same proportion. In this paper, we analyze three online discussion platforms for deliberation. We conduct a structural analysis regarding their interaction concepts and, based on activities of graduate students attending a Human-Computer Interaction course, this study conducts a usage analysis of the platforms. Results reveal the level of participants’ engagement in academic discussions and the effects on their learning perception. Moreover, results expose the impact of software design choices in the deliberation outcome.},
author_keywords={Academic deliberation; Collaboration; Considerate; Debate hub; HCI; Interaction design; Social computing; Trello},
publisher={Springer Verlag},
document_type={Conference Paper},
source={Scopus},
}
Asynchronous online discussions are relevant for supporting and promoting debates among people. Nevertheless, achieving beneficial discussion requires adequate software applications with specific features to support people’s participation, e.g., mechanisms for structured pros and cons arguments. Although the literature is vast in discussing online forum usage, requirements for the design of platforms for academic deliberation have not been addressed in the same proportion. In this paper, we analyze three online discussion platforms for deliberation. We conduct a structural analysis regarding their interaction concepts and, based on activities of graduate students attending a Human-Computer Interaction course, this study conducts a usage analysis of the platforms. Results reveal the level of participants’ engagement in academic discussions and the effects on their learning perception. Moreover, results expose the impact of software design choices in the deliberation outcome.
|
Destro, Juliana Medeiros;
Reis, Julio Cesar dos;
Carvalho, Ariadne Maria Brito Rizzoni;
Ricarte, Ivan Luiz Marques
Influence of Semantic Similarity Measures on Ontology Cross-Language Mappings (conference)
Proceedings of the Symposium on Applied Computing,
ACM,
2017.
(
Abstract |
Links |
BibTeX |
Tags:
biomedical ontologies,
cross-language matching,
semantic similarity,
ontologies,
ontology mapping
)
@inproceedings{destro2017i,
author = {Destro, Juliana Medeiros and Reis, Julio Cesar dos and Carvalho, Ariadne Maria Brito Rizzoni and Ricarte, Ivan Luiz Marques},
title = {Influence of Semantic Similarity Measures on Ontology Cross-Language Mappings},
year = {2017},
isbn = {9781450344869},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3019612.3019836},
doi = {10.1145/3019612.3019836},
booktitle = {Proceedings of the Symposium on Applied Computing},
pages = {323–329},
numpages = {7},
keywords = {biomedical ontologies, cross-language matching, semantic similarity, ontologies, ontology mapping},
location = {Marrakech, Morocco},
series = {SAC ’17}
}
Cross-language mappings establish relations between ontology concepts defined in different languages. Similarity measures calculate the degree of relatedness between concepts to support matching between two distinct ontologies. Cross-language matching remains an open research issue due to the difficulties in taking advantage of similarity computation. This article investigates the effects of different semantic similarity measures on the identification of cross-language mappings. We carry out experiments exploring real-world biomedical ontology mappings to comprehend the behaviour of computed similarity values. The obtained results indicate the relevance of domain-related background knowledge to the effectiveness of semantic measures for ontology cross-language alignment.
|
2016 |
Pantoja, Fagner L.;
Cavoto, Patrícia;
Reis, Julio Cesar dos;
Santanchè, André
Generating Knowledge Networks from Phenotypic Descriptions (conference)
Proceedings of the 12th International Conference on eScience,
Baltimore, MD, USA,
2016.
(
Links |
BibTeX |
Tags:
Curation
)
@conference{Pantoja,
address = {Baltimore, MD, USA},
author = {Fagner L. Pantoja and Patrícia Cavoto and Julio Cesar dos Reis and André Santanchè},
booktitle = {Proceedings of the 12th International Conference on eScience},
date = {2016-10-24},
keyword = {Curation},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/12/1095anav.pdf},
organization = {IEEE},
title = {Generating Knowledge Networks from Phenotypic Descriptions},
year = {2016}
}
|
Carvalho, Lucas A. M. C.;
Belhajjame, Khalid;
Medeiros, Claudia Bauzer
Converting Scripts into Reproducible Workflow Research Objects (conference)
Proceedings of the 2016 IEEE 12th International Conference on eScience,
IEEE,
Baltimore, MD, USA, October 23-27,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
Methodology, Provenance Information, Scientific Workflows
)
@conference{Carvalho2016Converting,
abstract = {Scientific discovery and analysis are increasingly computational and data-driven. While scripting languages, such as Python, R and Perl, are the means of choice of the majority of scientists to encode and run their data analysis, scripts are generally not amenable to reuse or reproducibility. Scripts rarely get reused or even shared with third-party scientists. We argue in this paper that the reproducibility of scripts can be promoted by converting them into workflow research objects. A workflow research object encodes a script into a production (executable) workflow that is accompanied by annotations, example datasets and provenance traces of their execution, thereby allowing third-party users to understand the data analysis encoded by the original script, run the associated workflow using the same or different dataset, or even repurpose it for a different analysis. To this end, we present a methodology for converting scripts into workflow research objects in a principled manner, guided by requirements that we elicited for this purpose. The methodology exploits tools and standards that have been developed by the community, in particular YesWorkflow, Research Objects and the W3C PROV. It is showcased using a real-world use case from the field of Molecular Dynamics.},
address = {Baltimore, MD, USA, October 23-27},
author = {Lucas A. M. C. Carvalho and Khalid Belhajjame and Claudia Bauzer Medeiros},
booktitle = {Proceedings of the 2016 IEEE 12th International Conference on eScience},
date = {2016-10-23},
keyword = {Methodology, Provenance Information, Scientific Workflows},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/08/converting-scripts-reproducible-camera-ready.pdf},
publisher = {IEEE},
title = {Converting Scripts into Reproducible Workflow Research Objects},
year = {2016}
}
Scientific discovery and analysis are increasingly computational and data-driven. While scripting languages, such as Python, R and Perl, are the means of choice of the majority of scientists to encode and run their data analysis, scripts are generally not amenable to reuse or reproducibility. Scripts rarely get reused or even shared with third-party scientists. We argue in this paper that the reproducibility of scripts can be promoted by converting them into workflow research objects. A workflow research object encodes a script into a production (executable) workflow that is accompanied by annotations, example datasets and provenance traces of their execution, thereby allowing third-party users to understand the data analysis encoded by the original script, run the associated workflow using the same or different dataset, or even repurpose it for a different analysis. To this end, we present a methodology for converting scripts into workflow research objects in a principled manner, guided by requirements that we elicited for this purpose. The methodology exploits tools and standards that have been developed by the community, in particular YesWorkflow, Research Objects and the W3C PROV. It is showcased using a real-world use case from the field of Molecular Dynamics.
|
Saraiva, Márcio de Carvalho;
Medeiros, Claudia Bauzer
Use of graphs and taxonomic classifications to analyze content relationships among courseware (conference)
Proceedings of the 31st Brazilian Symposium on Databases,
Salvador, Bahia, Brazil, October 4-7,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
content analysis, educational material, graph database, multiple relationships
)
@conference{Saraiva2016Short,
abstract = {The search for educational content in courseware repositories is laborious and time consuming. There is an abundance of such repositories, and research efforts to facilitate search, but access is guided by keywords and/or terms selected by courseware authors, thus lacking flexibility. The goal of this project is to design and develop a suite of tools to assist users to find, analyze and select pieces of educational content that are relevant to their learning goals. Contributions will be both at the algorithm and software design level, and at the user (application) level.},
address = {Salvador, Bahia, Brazil, October 4-7},
author = {Márcio de Carvalho Saraiva and Claudia Bauzer Medeiros},
booktitle = {Proceedings of the 31st Brazilian Symposium on Databases},
date = {2016-10-04},
issn = {2316-5170},
keyword = {content analysis, educational material, graph database, multiple relationships},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/01/marciosaraiva-sbbd2016.pdf},
organization = {Sociedade Brasileira de Computação},
pages = {265-270},
title = {Use of graphs and taxonomic classifications to analyze content relationships among courseware},
year = {2016}
}
The search for educational content in courseware repositories is laborious and time consuming. There is an abundance of such repositories, and research efforts to facilitate search, but access is guided by keywords and/or terms selected by courseware authors, thus lacking flexibility. The goal of this project is to design and develop a suite of tools to assist users to find, analyze and select pieces of educational content that are relevant to their learning goals. Contributions will be both at the algorithm and software design level, and at the user (application) level.
|
Carvalho, Lucas A. M. C.;
Medeiros, Claudia Bauzer
Provenance-Based Infrastructure to Support Reuse of Computational Experiments (conference)
Proceedings of the Satellite Events of the 31st Brazilian Symposium on Databases (Thesis and Dissertations Workshop),
Salvador, Bahia, Brazil, October 4-7,
978-85-7669-343-7,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
Provenance Information, Scientific Workflows, Semantic Annotation, Workflow Retrieval
)
@conference{Carvalho2016Wtd,
abstract = {One recurrent problem in multidisciplinary research is finding reusable objects (e.g., scripts, code, documents, workflows) that can be used across disciplines to enhance collaboration. This paper presents our ongoing work taking advantage of provenance information, combined with scientific workflows, to help find such objects. We also present challenges posed by provenance-based retrieval, which we propose as a solution for transdisciplinary scientific collaboration via reuse. Our case study in molecular dynamics experiments is part of a larger multi-scale experimental scenario that requires cooperation involving scientists from different disciplines.},
address = {Salvador, Bahia, Brazil, October 4-7},
author = {Lucas A. M. C. Carvalho and Claudia Bauzer Medeiros},
booktitle = {Proceedings of the Satellite Events of the 31st Brazilian Symposium on Databases (Thesis and Dissertations Workshop)},
date = {2016-10-04},
isbn = {978-85-7669-343-7},
keyword = {Provenance Information, Scientific Workflows, Semantic Annotation, Workflow Retrieval},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/01/wtdbd2016-camera-ready.pdf},
organization = {Sociedade Brasileira de Computação},
pages = {74-81},
title = {Provenance-Based Infrastructure to Support Reuse of Computational Experiments},
year = {2016}
}
One recurrent problem in multidisciplinary research is finding reusable objects (e.g., scripts, code, documents, workflows) that can be used across disciplines to enhance collaboration. This paper presents our ongoing work taking advantage of provenance information, combined with scientific workflows, to help find such objects. We also present challenges posed by provenance-based retrieval, which we propose as a solution for transdisciplinary scientific collaboration via reuse. Our case study in molecular dynamics experiments is part of a larger multi-scale experimental scenario that requires cooperation involving scientists from different disciplines.
|
Pantoja, Fagner Leal
Generating Knowledge Networks from Phenotypic Descriptions (mastersthesis)
University of Campinas,
mastersthesis,
2016.
(
Links |
BibTeX |
Tags:
Curation
)
@mastersthesis{Pantoja2016,
author = {Fagner Leal Pantoja},
date = {2016-08-05},
keyword = {Curation},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/12/fagner-dissertacao-vfinal.pdf},
school = {University of Campinas},
title = {Generating Knowledge Networks from Phenotypic Descriptions},
year = {2016}
}
|
Borges, Luana Loubet
BioGraph: Linking Biological Bases Across Organisms (mastersthesis)
Universidade Estadual de Campinas - UNICAMP,
mastersthesis,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
database, Ontologies (Information retrieval)
)
@mastersthesis{Borges2016,
abstract = {Representing data as networks has been shown to be a powerful approach for data analysis in biodiversity, e.g., interactions among organisms; relations among genes and phenotypes etc. In this context, databases and repositories following a graph model (e.g., RDF) have been increasingly used to interconnect information and to support network-driven analysis. Usually, this kind of analysis requires gathering together and linking data from several distinct and heterogeneous sources. In this work, we investigate this challenge in the context of biological bases focusing on the characterization of living organisms, especially their phenotypes and diseases. It includes the rich diversity of Model Organism Databases (MODs) -- repositories specialized in a particular taxon -- widely used in biological and medical studies. We exploit a lightweight integration approach, inspired by the Linked Open Data initiative, mapping several biological bases in a unified graph database -- our BioGraph -- and linking key elements to offer an interconnected view over the data. We present here practical experiments to validate the proposal and to show how BioGraph can contribute to biological data analysis in a network perspective.},
author = {Luana Loubet Borges},
date = {2016-08-05},
keyword = {database, Ontologies (Information retrieval)},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/10/dissertacao_versao_final_Luana_Loubet.pdf},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {BioGraph: Linking Biological Bases Across Organisms},
year = {2016}
}
Representing data as networks has been shown to be a powerful approach for data analysis in biodiversity, e.g., interactions among organisms; relations among genes and phenotypes etc. In this context, databases and repositories following a graph model (e.g., RDF) have been increasingly used to interconnect information and to support network-driven analysis. Usually, this kind of analysis requires gathering together and linking data from several distinct and heterogeneous sources. In this work, we investigate this challenge in the context of biological bases focusing on the characterization of living organisms, especially their phenotypes and diseases. It includes the rich diversity of Model Organism Databases (MODs) -- repositories specialized in a particular taxon -- widely used in biological and medical studies. We exploit a lightweight integration approach, inspired by the Linked Open Data initiative, mapping several biological bases in a unified graph database -- our BioGraph -- and linking key elements to offer an interconnected view over the data. We present here practical experiments to validate the proposal and to show how BioGraph can contribute to biological data analysis in a network perspective.
|
Carvalho, Lucas A. M. C.;
Silveira, Rodrigo L.;
Pereira, Caroline S.;
Skaf, Munir S.;
Medeiros, Claudia Bauzer
Provenance-Based Retrieval: Fostering Reuse and Reproducibility Across Scientific Disciplines (conference)
Provenance and Annotation of Data and Processes (Proceedings of 6th International Provenance and Annotation Workshop - IPAW 2016),
Springer International Publishing,
McLean, Virginia, U.S.A.,
978-3319405926,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
Provenance Information, Scientific Workflows, Semantic Annotation, Workflow Retrieval
)
@conference{Carvalho2016,
abstract = {When computational researchers from several domains cooperate, one recurrent problem is finding tools, methods and approaches that can be used across disciplines, to enhance collaboration through reuse. The paper presents our ongoing work to meet the challenges posed by provenance-based retrieval, proposed as a solution for transdisciplinary scientific collaboration via reuse of scientific workflows. Our work is based upon a case study in molecular dynamics experiments, as part of a larger multi-scale experimental scenario.},
address = {McLean, Virginia, U.S.A.},
author = {Lucas A. M. C. Carvalho and Rodrigo L. Silveira and Caroline S. Pereira and Munir S. Skaf and Claudia Bauzer Medeiros},
booktitle = {Provenance and Annotation of Data and Processes (Proceedings of 6th International Provenance and Annotation Workshop - IPAW 2016)},
chapter = {17},
date = {2016-06-06},
editor = {Marta Mattoso and Boris Glavic},
isbn = {978-3319405926},
keyword = {Provenance Information, Scientific Workflows, Semantic Annotation, Workflow Retrieval},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/05/ipaw2016-poster-cameraready.pdf},
pages = {1-4},
publisher = {Springer International Publishing},
series = {LNCS 9672},
title = {Provenance-Based Retrieval: Fostering Reuse and Reproducibility Across Scientific Disciplines},
year = {2016}
}
When computational researchers from several domains cooperate, one recurrent problem is finding tools, methods and approaches that can be used across disciplines, to enhance collaboration through reuse. The paper presents our ongoing work to meet the challenges posed by provenance-based retrieval, proposed as a solution for transdisciplinary scientific collaboration via reuse of scientific workflows. Our work is based upon a case study in molecular dynamics experiments, as part of a larger multi-scale experimental scenario.
|
Mota, Matheus Silva;
Reis, Julio Cesar dos;
Goutte, Sandra;
Santanchè, André
Multiscaling a Graph-based Dataspace [accepted] (article)
Journal of Information and Data Management - JIDM,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
Journal Paper
)
@article{linkedscalesjidm,
abstract = {Biologists increasingly need a unified view to understand and discover relationships among data elements scattered along data sources with different levels of heterogeneity. Existing approaches usually adopt ad-hoc heavyweight integration strategies, requiring a costly upfront effort involving a monolithic chain of steps to handle specific formats/schemas, with low or no reuse. This article proposes the conception of a multiscale-based dataspace architecture, called LinkedScales. It departs from the notion of integration-scales within a dataspace, and defines a systematic and progressive integration process via graph-based transformations over a graph database. LinkedScales aims to provide a homogeneous view of heterogeneous sources, allowing systems to reach and produce different integration levels on demand, going from raw representations (lower scales) towards ontology-like structures (higher scales). We describe inner aspects of the architecture and its transformation process by introducing the Multiscale Transformation Graph, which tracks the transformation process among scales. Although the proposed framework can be applied to several scenarios, this work focuses on the biology domain addressing the organism-centric analysis scenario. Obtained results reveal the viability of the framework and its implementation to integrate relevant resources for the organism-centric scenario.},
author = {Matheus Silva Mota and Julio Cesar dos Reis and Sandra Goutte and André Santanchè},
date = {2016-05-01},
journal = {Journal of Information and Data Management - JIDM},
keyword = {Journal Paper},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/10/multiscaling-graph-based.pdf},
pages = {16},
title = {Multiscaling a Graph-based Dataspace [accepted]},
year = {2016}
}
Biologists increasingly need a unified view to understand and discover relationships among data elements scattered along data sources with different levels of heterogeneity. Existing approaches usually adopt ad-hoc heavyweight integration strategies, requiring a costly upfront effort involving a monolithic chain of steps to handle specific formats/schemas, with low or no reuse. This article proposes the conception of a multiscale-based dataspace architecture, called LinkedScales. It departs from the notion of integration-scales within a dataspace, and defines a systematic and progressive integration process via graph-based transformations over a graph database. LinkedScales aims to provide a homogeneous view of heterogeneous sources, allowing systems to reach and produce different integration levels on demand, going from raw representations (lower scales) towards ontology-like structures (higher scales). We describe inner aspects of the architecture and its transformation process by introducing the Multiscale Transformation Graph, which tracks the transformation process among scales. Although the proposed framework can be applied to several scenarios, this work focuses on the biology domain addressing the organism-centric analysis scenario. Obtained results reveal the viability of the framework and its implementation to integrate relevant resources for the organism-centric scenario.
|
Gonçalves, Fabrício Matheus
Design de interação em sistemas computacionais para apoio à aprendizagem ativa: uma abordagem sistêmica (mastersthesis)
University of Campinas - Unicamp,
mastersthesis,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
Active learning, Agile software development, Autonomy, Interação humano-computador, Peer instruction, Peer teaching, Self-determination (Psychology), Semiótica organizacional
)
@mastersthesis{goncalves2016design,
abstract = {In formal learning contexts, there is a diversity of interests and skills that need to have space in the interaction between those involved in the production and sharing of knowledge. Active Learning refers to a set of strategies through which the participation of key actors in the educational environment goes beyond the unidirectional expository/receptive model of knowledge, involving activities where discussion and collaboration with others have an important role in reflection and construction of meaning. Computer systems that support the learning processes have not been designed for the diversity of skills and demands present in such environments. In this thesis we argue that if we want to develop solutions that make sense to stakeholders and are suited to the complexity that characterizes Active Learning scenarios, we need to include them in the design cycle. In this work we adopt a perspective based on Organizational Semiotics for the analysis of Active Learning scenarios and propose a systemic vision for the interaction design in such environments, including motivational aspects. Work results include: a cyclical design process that we call "on the fly design", and a system for collaborative authoring and review of Active Learning activities. This process has been experimented in the incremental construction of the system which, in turn, has been experimented in a real context of higher education. The system was evaluated iteratively, based on feedback from stakeholders in the situated context, feeding back the characterization of the process. The process was effective in building an emerging system for supporting the collaboration and participation of those involved in the experimented Active Learning scenarios.},
author = {Fabrício Matheus Gonçalves},
date = {2016-02-04},
keyword = {Active learning, Agile software development, Autonomy, Interação humano-computador, Peer instruction, Peer teaching, Self-determination (Psychology), Semiótica organizacional},
link = {http://www.reposip.unicamp.br/xmlui/bitstream/handle/REPOSIP/305633/Goncalves%2c%20Fabricio%20Matheus_M.pd?sequence=1&isAllowed=y},
school = {University of Campinas - Unicamp},
title = {Design de interação em sistemas computacionais para apoio à aprendizagem ativa: uma abordagem sistêmica},
year = {2016}
}
In formal learning contexts, there is a diversity of interests and skills that need to have space in the interaction between those involved in the production and sharing of knowledge. Active Learning refers to a set of strategies through which the participation of key actors in the educational environment goes beyond the unidirectional expository/receptive model of knowledge, involving activities where discussion and collaboration with others have an important role in reflection and construction of meaning. Computer systems that support the learning processes have not been designed for the diversity of skills and demands present in such environments. In this thesis we argue that if we want to develop solutions that make sense to stakeholders and are suited to the complexity that characterizes Active Learning scenarios, we need to include them in the design cycle. In this work we adopt a perspective based on Organizational Semiotics for the analysis of Active Learning scenarios and propose a systemic vision for the interaction design in such environments, including motivational aspects. Work results include: a cyclical design process that we call 'on the fly design', and a system for collaborative authoring and review of Active Learning activities. This process has been experimented in the incremental construction of the system which, in turn, has been experimented in a real context of higher education. The system was evaluated iteratively, based on feedback from stakeholders in the situated context, feeding back the characterization of the process. The process was effective in building an emerging system for supporting the collaboration and participation of those involved in the experimented Active Learning scenarios.
|
Cavoto, Patrícia
ReGraph: Bridging Relational and Graph Databases (mastersthesis)
Universidade Estadual de Campinas - UNICAMP,
mastersthesis,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
Databases, Ontologies (Information retrieval), Software development - Databases
)
@mastersthesis{Cavoto2016,
abstract = {Networks are everywhere. From social interactions: family, friends, hobbies; passing through computer science: computers on the Internet; to nature: as food chains. Recent research shows the importance of links and network analysis to discover knowledge in existing data. Moreover, the Linked Open Data and Semantic Web efforts empowered the fast growth of open knowledge repositories on the web, mainly in the RDF (Resource Description Framework) graph model. However, a lot of data are stored in relational databases, whose model has not been designed to address queries with many transitive relations. On the other hand, the flexible graph model is suitable for data analysis focusing on links, their transitivity and the network topology, e.g., a connected component analysis. Therefore, our research is inspired by the data OLAP (OnLine Analytical Processing) approach of creating a special database designed for data analysis, a network-driven data analysis, using graph databases. In this dissertation, we present ReGraph, a framework to map data from a relational to a graph database, managing a dynamic coexistence and evolution of both, not supported by related work. ReGraph has minimum impact on the existing infrastructure, providing a flexible and tailored graph model for each relational schema. It uses an initial ETL (Extract, Transform and Load) process to replicate the existing data in the graph database. A scheduled service is responsible for automatically reflecting changes in the relational data into the graph, keeping both synchronized. ReGraph also provides an annotation functionality to materialize inferences and to support data enrichment, which enables linking the local database to global knowledge graphs on the Web. We have used the ReGraph framework to generate FishGraph, a graph database created from the FishBase relational database. Using FishGraph we developed experiments to analyze the connections among thousands of identification keys and species, and we have linked local data to DBpedia, creating annotations over the local graph and providing new knowledge from existing data.},
author = {Patrícia Cavoto},
date = {2016-02-04},
keyword = {Databases, Ontologies (Information retrieval), Software development - Databases},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/05/Cavoto2016.pdf},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {ReGraph: Bridging Relational and Graph Databases},
year = {2016}
}
Networks are everywhere. From social interactions: family, friends, hobbies; passing through computer science: computers on the Internet; to nature: as food chains. Recent research shows the importance of links and network analysis to discover knowledge in existing data. Moreover, the Linked Open Data and Semantic Web efforts empowered the fast growth of open knowledge repositories on the web, mainly in the RDF (Resource Description Framework) graph model. However, a lot of data are stored in relational databases, whose model has not been designed to address queries with many transitive relations. On the other hand, the flexible graph model is suitable for data analysis focusing on links, their transitivity and the network topology, e.g., a connected component analysis. Therefore, our research is inspired by the data OLAP (OnLine Analytical Processing) approach of creating a special database designed for data analysis, a network-driven data analysis, using graph databases. In this dissertation, we present ReGraph, a framework to map data from a relational to a graph database, managing a dynamic coexistence and evolution of both, not supported by related work. ReGraph has minimum impact on the existing infrastructure, providing a flexible and tailored graph model for each relational schema. It uses an initial ETL (Extract, Transform and Load) process to replicate the existing data in the graph database. A scheduled service is responsible for automatically reflecting changes in the relational data into the graph, keeping both synchronized. ReGraph also provides an annotation functionality to materialize inferences and to support data enrichment, which enables linking the local database to global knowledge graphs on the Web. We have used the ReGraph framework to generate FishGraph, a graph database created from the FishBase relational database. Using FishGraph we developed experiments to analyze the connections among thousands of identification keys and species, and we have linked local data to DBpedia, creating annotations over the local graph and providing new knowledge from existing data.
|
Daltio, Jaudete;
Medeiros, Claudia Bauzer
A View Handler for Semantic Graphs (conference)
Proceedings 10th IEEE ICSC,
Los Angeles,
2016.
(
Abstract |
Links |
BibTeX |
Tags:
Graph Databases
)
@conference{Daltio2016,
abstract = {Scientific data often come from networks with complex relationships between their entities and can be properly modeled as semantic graphs. However, once designed, there is no simple way to cross through different designs in graph databases. The goal of this research is to specify and implement a framework to overcome these limitations, allowing users to build and explore arbitrary perspectives in graphs. The framework uses the concept of views to represent a perspective. The main contribution is to help scientists run models and analyze network (graph) data according to their specific design needs. The framework is under implementation and validation using a case study on water resource data.},
address = {Los Angeles},
author = {Jaudete Daltio and Claudia Bauzer Medeiros},
date = {2016-02-03},
booktitle = {Proceedings of the 10th IEEE International Conference on Semantic Computing (ICSC)},
keyword = {Graph Databases},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/PID4045125.pdf},
pages = {1-5},
publisher = {IEEE},
title = {A View Handler for Semantic Graphs},
year = {2016}
}
Scientific data often come from networks with complex relationships between their entities and can be properly modeled as semantic graphs. However, once designed, there is no simple way to cross through different designs in graph databases. The goal of this research is to specify and implement a framework to overcome these limitations, allowing users to build and explore arbitrary perspectives in graphs. The framework uses the concept of views to represent a perspective. The main contribution is to help scientists run models and analyze network (graph) data according to their specific design needs. The framework is under implementation and validation using a case study on water resource data.
|
2015 |
Daltio, Jaudete;
Medeiros, Claudia Bauzer
Hydrograph: Exploring Geographic Data in Graph Databases (conference)
XVI Brazilian Symposium on Geoinformatics (GEOINFO),
Campos do Jordao,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Graph Databases
)
@conference{Daltio2015,
abstract = {Water becomes scarcer every day. Reliable information about volume and quality in each watershed is important for the management and proper planning of its use. Data-intensive science is increasingly needed in this context. Associated analysis processes require handling the drainage network that represents a watershed. This paper presents ongoing work that explores geographic watershed data using graph databases – a scalable and flexible kind of NoSQL database. The Brazilian Watershed database is used as a case study. The mapping between geographic and graph models is based on the natural network that emerges from the topological relationships among geographic entities.},
address = {Campos do Jordão},
author = {Jaudete Daltio and Claudia Bauzer Medeiros},
booktitle = {XVI Brazilian Symposium on Geoinformatics (GEOINFO)},
date = {2015-11-30},
keyword = {Graph Databases},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/daltio_medeiros_geoinfo.pdf},
pages = {44-55},
title = {Hydrograph: Exploring Geographic Data in Graph Databases},
year = {2015}
}
Water becomes, every day, more scarce. Reliable information about volume and quality in each watershed is important to management and proper planning of their use. Data-intensive science is being increasingly needed in this context. Associated analysis processes require handling the drainage network that represents a watershed. This paper presents an ongoing work that explores geographic watershed data using graph databases – a scalable and flexible kind of NoSQL databases. The Brazilian Watershed database is used as a case study. The mapping between geographic and graph models is based on the natural network that emerges from the topological relationships among geographic entities.
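The mapping described above can be illustrated with a toy sketch (an assumed representation, not the paper's actual schema): drainage segments become nodes, the flows-into relationship becomes directed edges, and a typical watershed question such as "what lies upstream of this point" becomes a reverse reachability traversal.

```python
# Sketch of a drainage network as a directed graph, edges pointing
# downstream. Segment names and the dict representation are assumptions
# for illustration only.

downstream = {          # segment -> segment it flows into
    "s1": "s3",
    "s2": "s3",
    "s3": "s4",
}

def upstream_of(target):
    """All segments whose water eventually reaches `target`."""
    found = set()
    frontier = [target]
    while frontier:
        node = frontier.pop()
        for seg, dst in downstream.items():
            if dst == node and seg not in found:
                found.add(seg)
                frontier.append(seg)
    return found

print(upstream_of("s4"))  # {'s1', 's2', 's3'}
```

In a graph database the same question is a single traversal query, whereas a relational formulation needs recursive joins of unbounded depth.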
|
Junior, Luiz Celso Gomes
Querying and Managing Complex Networks (phdthesis)
Universidade Estadual de Campinas - UNICAMP,
phdthesis,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Complex Networks, Databases
)
@phdthesis{Gomes-Jr2015c,
abstract = {Understanding and quantifying the emergent properties of natural and man-made networks such as food webs, social interactions, and transportation infrastructures is a challenging task. The complex networks field was developed to encompass measurements, algorithms, and techniques to tackle such topics. Although complex networks research has been successfully applied to several areas of human activity, there is still a lack of common infrastructures for routine tasks, especially those related to data management. On the other hand, the databases field has focused on mastering data management issues since its beginnings, several decades ago. Database systems, however, offer limited network analysis capabilities. To enable a better support for complex network analysis tasks, a database system must offer adequate querying and data management capabilities. This thesis advocates for a tighter integration between the areas and presents our efforts towards this goal. Here we describe the Complex Data Management System (CDMS), which enables explorative querying of complex networks through a declarative query language. Query results are ranked based on network measurements assessed at query time. To support query processing, we introduce the Beta-algebra, which offers an operator capable of representing diverse measurements typical of complex network analysis. The algebra offers opportunities for transparent query optimization through query rewritings, proposed and discussed here. We also introduce the mapper mechanism for relationship management, which is integrated in the query language. The flexible query language and data management mechanisms are useful in scenarios other than complex network analysis. We demonstrate the use of the CDMS in applications such as institutional data integration, information retrieval, classification and recommendation. All aspects of the proposal are implemented and have been tested with real and synthetic data.},
author = {Luiz Celso Gomes Junior},
date = {2015-10-26},
keyword = {Complex Networks, Databases},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/Celso-Jr-Doutorado.pdf},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {Querying and Managing Complex Networks},
year = {2015}
}
Understanding and quantifying the emergent properties of natural and man-made networks such as food webs, social interactions, and transportation infrastructures is a challenging task. The complex networks field was developed to encompass measurements, algorithms, and techniques to tackle such topics. Although complex networks research has been successfully applied to several areas of human activity, there is still a lack of common infrastructures for routine tasks, especially those related to data management. On the other hand, the databases field has focused on mastering data management issues since its beginnings, several decades ago. Database systems, however, offer limited network analysis capabilities. To enable a better support for complex network analysis tasks, a database system must offer adequate querying and data management capabilities. This thesis advocates for a tighter integration between the areas and presents our efforts towards this goal. Here we describe the Complex Data Management System (CDMS), which enables explorative querying of complex networks through a declarative query language. Query results are ranked based on network measurements assessed at query time. To support query processing, we introduce the Beta-algebra, which offers an operator capable of representing diverse measurements typical of complex network analysis. The algebra offers opportunities for transparent query optimization through query rewritings, proposed and discussed here. We also introduce the mapper mechanism for relationship management, which is integrated in the query language. The flexible query language and data management mechanisms are useful in scenarios other than complex network analysis. We demonstrate the use of the CDMS in applications such as institutional data integration, information retrieval, classification and recommendation. All aspects of the proposal are implemented and have been tested with real and synthetic data.
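The idea of ranking query results by network measurements assessed at query time can be sketched generically (node degree serves as a stand-in measurement here; the Beta-algebra itself is not reproduced):

```python
# Generic illustration of measurement-based ranking, not the thesis's
# Beta-algebra: candidate nodes are ordered by a network measurement
# (degree) computed over the graph at query time.

from collections import defaultdict

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

degree = defaultdict(int)
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

def query(candidates):
    """Return candidate nodes ranked by the measurement, highest first."""
    return sorted(candidates, key=lambda n: degree[n], reverse=True)

print(query(["d", "a", "b"]))  # ['a', 'b', 'd']  (degrees 2, 2, 1)
```

Swapping the measurement (centrality, clustering, relevance to a query context) changes the ranking without changing the query itself, which is the kind of flexibility the declarative approach aims for.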
|
Pantoja, Fagner L.;
Reis, Julio Cesar dos;
Santanchè, André
Semantic Interpretation of Biological Identification Keys (conference)
Proceedings of the Brazilian Symposium on Databases (SBBD), 2015,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Identification Keys, NER, Semantic Interpretation
)
@conference{Pantoja2015,
abstract = {In biological data, Identification Keys (IKs) are central artifacts used by biologists to identify the taxonomic group of an observed specimen, such as family, order, species, etc. Despite their relevance, IKs are usually defined in a semistructured textual format, which does not favor easily retrieval and deep analysis over their data. This article aims to present a method to formally structure and extract semantic facts from IKs relying on graphs and domain ontologies. The approach explores classical extraction and matching procedures combined with the specific characteristics of IKs. Initial experiments reveal the feasibility of the approach.},
author = {Fagner L. Pantoja and Julio Cesar dos Reis and André Santanchè},
booktitle = {Proceedings of the Brazilian Symposium on Databases (SBBD), 2015},
date = {2015-10-13},
keyword = {Identification Keys, NER, Semantic Interpretation},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/article-2015-08-20comments2.pdf},
title = {Semantic Interpretation of Biological Identification Keys},
year = {2015}
}
In biological data, Identification Keys (IKs) are central artifacts used by biologists to identify the taxonomic group of an observed specimen, such as family, order, species, etc. Despite their relevance, IKs are usually defined in a semistructured textual format, which does not favor easily retrieval and deep analysis over their data. This article aims to present a method to formally structure and extract semantic facts from IKs relying on graphs and domain ontologies. The approach explores classical extraction and matching procedures combined with the specific characteristics of IKs. Initial experiments reveal the feasibility of the approach.
|
Cavoto, Patrícia;
Santanchè, André
ReGraph: Bridging Relational and Graph Databases (conference)
Proceedings of Satellite Events of the 30th Brazilian Symposium on Databases 2015 (SBBD 2015),
Sociedade Brasileira de Computação (SBC),
Petrópolis, RJ,
2316-5170,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
graph database, ReGraph framework, relational database
)
@conference{Cavoto2015c,
abstract = {In this paper, we present ReGraph, a framework to map data from a relational to a graph database, managing a dynamic coexistence and evolution of both, not supported by related work. ReGraph has minimal impact in the existing infrastructure, providing a flexible and tailored graph model for each relational schema. It uses an initial ETL (Extract, Transform and Load) process to replicate the existing data in the graph database. A scheduled service is responsible for reflecting changes in the relational data into the graph, keeping both synchronized. ReGraph also provides an annotation functionality that allows users to add new information in the mapped graph, providing the materialization of inferences and data enrichment.},
address = {Petrópolis, RJ},
author = {Patrícia Cavoto and André Santanchè},
booktitle = {Proceedings of Satellite Events of the 30th Brazilian Symposium on Databases 2015 (SBBD 2015)},
date = {2015-10-13},
issn = {2316-5170},
keyword = {graph database, ReGraph framework, relational database},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/Cavoto2015c.pdf},
note = {Demo-paper},
pages = {179-184},
publisher = {Sociedade Brasileira de Computação (SBC)},
title = {ReGraph: Bridging Relational and Graph Databases},
year = {2015}
}
In this paper, we present ReGraph, a framework to map data from a relational to a graph database, managing a dynamic coexistence and evolution of both, not supported by related work. ReGraph has minimal impact in the existing infrastructure, providing a flexible and tailored graph model for each relational schema. It uses an initial ETL (Extract, Transform and Load) process to replicate the existing data in the graph database. A scheduled service is responsible for reflecting changes in the relational data into the graph, keeping both synchronized. ReGraph also provides an annotation functionality that allows users to add new information in the mapped graph, providing the materialization of inferences and data enrichment.
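The initial ETL step can be sketched as follows (illustrative table and edge-label names; ReGraph's actual mapping rules are richer): rows become nodes and foreign keys become edges.

```python
# Sketch of a relational-to-graph ETL pass, assuming a toy schema.
# Table names, columns, and the OF_SPECIES label are illustrative only.

import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE occurrence (id INTEGER PRIMARY KEY, species_id INTEGER);
    INSERT INTO species VALUES (1, 'Danio rerio');
    INSERT INTO occurrence VALUES (10, 1);
""")

nodes, edges = {}, []
# Each row becomes a node keyed by (table, primary key).
for sid, name in db.execute("SELECT id, name FROM species"):
    nodes[("species", sid)] = {"name": name}
# Each foreign key becomes a typed edge between the mapped nodes.
for oid, sid in db.execute("SELECT id, species_id FROM occurrence"):
    nodes[("occurrence", oid)] = {}
    edges.append((("occurrence", oid), "OF_SPECIES", ("species", sid)))

print(edges)  # [(('occurrence', 10), 'OF_SPECIES', ('species', 1))]
```

The synchronization service described in the abstract would re-run a pass like this incrementally, applying only the rows changed since the last run.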
|
Mota, Matheus Silva;
Reis, Julio Cesar dos;
Goutte, Sandra;
Santanchè, André
Multiscale Dataspace for Organism-centric Analysis (conference)
Proceedings of the Brazilian Symposium on Databases (SBBD),
2015.
(
Abstract |
Links |
BibTeX |
Tags:
linkedscales
)
@conference{Mota2015b,
abstract = {Biologists increasingly need a unified view to understand and discover relationships among data elements scattered along data sources with different levels of heterogeneity. Existing approaches usually adopt ad-hoc heavyweight integration strategies, requiring a costly upfront effort involving a monolithic chain of steps to handle specific formats/schemas, with low or no reuse. This article proposes an original framework based on scales aligned with the dataspaces on demand integration principle. Scales systematize and encapsulate integration in discrete steps, fulfilling the dynamicity of the process through reuse of previous scales and localized customization. Although the proposed framework can be extended to several scenarios, this work focuses on the biology domain addressing the organism-centric analysis scenario.},
author = {Matheus Silva Mota and Julio Cesar dos Reis and Sandra Goutte and André Santanchè},
booktitle = {Proceedings of the Brazilian Symposium on Databases (SBBD)},
date = {2015-10-01},
keyword = {linkedscales},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/09/paper1.pdf},
title = {Multiscale Dataspace for Organism-centric Analysis},
year = {2015}
}
Biologists increasingly need a unified view to understand and discover relationships among data elements scattered along data sources with different levels of heterogeneity. Existing approaches usually adopt ad-hoc heavyweight integration strategies, requiring a costly upfront effort involving a monolithic chain of steps to handle specific formats/schemas, with low or no reuse. This article proposes an original framework based on scales aligned with the dataspaces on demand integration principle. Scales systematize and encapsulate integration in discrete steps, fulfilling the dynamicity of the process through reuse of previous scales and localized customization. Although the proposed framework can be extended to several scenarios, this work focuses on the biology domain addressing the organism-centric analysis scenario.
|
Borges, Luana Loubet;
Santanchè, André
Unificando a Comparação e Busca de Fenótipos em Model Organism Databases (conference)
Proceedings of 7th Brazilian Conference on Ontological Research (ONTOBRAS 2015),
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Borges2015,
abstract = {Model Organism Databases (MODs) are widely used in medical and biological research. Since each MOD is usually specialized in one kind of organism (e.g., zebrafish, rat, human, mouse), it is difficult to search for the same characteristic across distinct organisms for correlation and comparison purposes. This work presents a framework named Unified MOD Discovery Engine, whose goal is to support the correlation and search of data from several MODs by unifying their data representation. This paper presents the first step in this direction, in which the data models of two MODs, ZFIN (zebrafish) and MGI (mouse), were analyzed and compared as the basis for the design of a unified model. This model underlies an interlinked graph that will allow users to perform searches and comparisons in a unified way.},
author = {Luana Loubet Borges and André Santanchè},
booktitle = {Proceedings of 7th Brazilian Conference on Ontological Research (ONTOBRAS 2015)},
date = {2015-09-10},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/ontobras-luana-santanche-2015.pdf},
pages = {1-6},
title = {Unificando a Comparação e Busca de Fenótipos em Model Organism Databases},
year = {2015}
}
Model Organism Databases (MODs) are widely used in medical and biological research. Since each MOD is usually specialized in one kind of organism (e.g., zebrafish, rat, human, mouse), it is difficult to search for the same characteristic across distinct organisms for correlation and comparison purposes. This work presents a framework named Unified MOD Discovery Engine, whose goal is to support the correlation and search of data from several MODs by unifying their data representation. This paper presents the first step in this direction, in which the data models of two MODs, ZFIN (zebrafish) and MGI (mouse), were analyzed and compared as the basis for the design of a unified model. This model underlies an interlinked graph that will allow users to perform searches and comparisons in a unified way.
|
Mota, Matheus Silva;
Santanchè, André
Conceiving a Multiscale Dataspace for Data Analysis (conference)
Proceedings of the Brazilian Seminar on Ontologies (ONTOBRAS 2015),
CEUR,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
dataspace, multiscale
)
@conference{Mota2015,
abstract = {A consequence of the intensive growth of information shared online is the increase of opportunities to link and integrate distinct sources of knowledge. This linking and integration can be hampered by different levels of heterogeneity in the available sources. Existing approaches focusing on heavyweight integration – e.g., schema mapping or ontology alignment – require costly upfront efforts to handle specific formats/schemas. In this scenario, dataspaces emerge as a modern alternative approach to address the integration of heterogeneous sources. The classic heavyweight upfront one-step integration is replaced by an incremental integration, starting from lightweight connections, tightening and improving them when benefits worth such effort. Based on several previous work on data integration for data analysis, this work discusses the conception of a multiscale-based dataspace architecture, called LinkedScales. It departs from the notion of integration-scales within a dataspace, and defines a systematic and progressive integration process via graph-based transformations over a graph database. LinkedScales aims to provide a homogeneous view of heterogeneous sources, allowing systems to reach and produce different integration levels on demand, going from raw representations (lower scales) towards ontology-like structures (higher scales).},
author = {Matheus Silva Mota and André Santanchè},
booktitle = {Proceedings of the Brazilian Seminar on Ontologies (ONTOBRAS 2015)},
date = {2015-09-08},
issn = {16130073},
keyword = {dataspace, multiscale},
link = {http://www.ime.usp.br/~ontobras/wp-content/uploads/2015/09/Conceiving-a-Multiscale-Dataspace-for-Data-Analysis.pdf
http://www.lis.ic.unicamp.br/?attachment_id=690
http://ceur-ws.org/Vol-1442/paper_21.pdf},
pages = {12},
publisher = {CEUR},
title = {Conceiving a Multiscale Dataspace for Data Analysis},
volume = {1442},
year = {2015}
}
A consequence of the intensive growth of information shared online is the increase of opportunities to link and integrate distinct sources of knowledge. This linking and integration can be hampered by different levels of heterogeneity in the available sources. Existing approaches focusing on heavyweight integration – e.g., schema mapping or ontology alignment – require costly upfront efforts to handle specific formats/schemas. In this scenario, dataspaces emerge as a modern alternative approach to address the integration of heterogeneous sources. The classic heavyweight upfront one-step integration is replaced by an incremental integration, starting from lightweight connections, tightening and improving them when benefits worth such effort. Based on several previous work on data integration for data analysis, this work discusses the conception of a multiscale-based dataspace architecture, called LinkedScales. It departs from the notion of integration-scales within a dataspace, and defines a systematic and progressive integration process via graph-based transformations over a graph database. LinkedScales aims to provide a homogeneous view of heterogeneous sources, allowing systems to reach and produce different integration levels on demand, going from raw representations (lower scales) towards ontology-like structures (higher scales).
|
Cavoto, Patrícia;
Santanchè, André
Annotation-Based Method for Linking Local and Global Knowledge Graphs (conference)
Proceedings of the Brazilian Seminar on Ontologies (ONTOBRAS 2015),
2015.
(
Abstract |
Links |
BibTeX |
Tags:
annotation-based method, graph database, ontologies, ReGraph framework
)
@conference{Cavoto2015b,
abstract = {In the last years, the use of data available in “global graphs” as Linked Open Data and Ontologies are increasing faster and bringing with them the popularization of the graph structure to represent information networks. One challenge, in this context, is how to link local and global knowledge graphs. This paper presents an approach to address this problem through an annotation-based method to link a local graph database to global graphs. Different from related work, the local graph is not derived from a static dataset, but it is a dynamic graph database evolving along the time, containing connections (annotations) with global graphs that must stay consistent during its evolution. We applied this method over a dataset with more than 44,500 nodes, annotating them with the values found in DBpedia and GeoNames. The proposed method is an extension of our ReGraph framework that bridges relational and graph databases, keeping both integrated, synchronized and in their native representations, with minimal impact in the current infrastructure.},
author = {Patrícia Cavoto and André Santanchè},
booktitle = {Proceedings of the Brazilian Seminar on Ontologies (ONTOBRAS 2015)},
date = {2015-09-08},
issn = {16130073},
keyword = {annotation-based method, graph database, ontologies, ReGraph framework},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/Cavoto2015b.pdf},
note = {Short-paper},
pages = {1-6},
title = {Annotation-Based Method for Linking Local and Global Knowledge Graphs},
publisher = {CEUR},
volume = {1442},
year = {2015}
}
In the last years, the use of data available in “global graphs” as Linked Open Data and Ontologies are increasing faster and bringing with them the popularization of the graph structure to represent information networks. One challenge, in this context, is how to link local and global knowledge graphs. This paper presents an approach to address this problem through an annotation-based method to link a local graph database to global graphs. Different from related work, the local graph is not derived from a static dataset, but it is a dynamic graph database evolving along the time, containing connections (annotations) with global graphs that must stay consistent during its evolution. We applied this method over a dataset with more than 44,500 nodes, annotating them with the values found in DBpedia and GeoNames. The proposed method is an extension of our ReGraph framework that bridges relational and graph databases, keeping both integrated, synchronized and in their native representations, with minimal impact in the current infrastructure.
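The annotation idea can be sketched minimally (the lookup table below is a hypothetical stand-in for live DBpedia/GeoNames queries): local nodes gain annotation edges pointing to URIs in global graphs, stored alongside the local data.

```python
# Sketch of annotating local graph nodes with global-graph URIs.
# The lookup dict stands in for querying DBpedia/GeoNames; no network
# access happens here, and the node shapes are assumptions.

global_uris = {
    "Brazil": "http://dbpedia.org/resource/Brazil",
}

local_nodes = [{"name": "Brazil"}, {"name": "unknown place"}]

annotations = []
for node in local_nodes:
    uri = global_uris.get(node["name"])
    if uri:  # only nodes with a confident match get an annotation edge
        annotations.append((node["name"], "sameAs", uri))

print(annotations)
```

Keeping annotations as explicit edges, rather than overwriting local values, is what lets them be revalidated as the local graph evolves.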
|
Cavoto, Patrícia;
Cardo, Victor;
Lebbe, Régine Vignes;
Santanchè, André
FishGraph: A Network-Driven Data Analysis (conference)
2015 IEEE 11th International Conference on e-Science (e-Science 2015),
IEEE,
Munich, Germany,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Biodiversity Information Systems, graph database, network topology analysis
)
@conference{Cavoto2015,
abstract = {There are a lot of data about biodiversity stored in different database models and most of them are relational. Recent research shows the importance of links and network analysis to discover knowledge in existing data. However, the relational model was not designed to address problems in which the links between data have the same importance as the data -- a common scenario in the biodiversity area. Moreover, the Linked Data and Semantic Web efforts empowered the fast growth of open knowledge repositories on the web, mainly in the RDF (Resource Description Framework) graph model. The flexible graph database model contrasts with the rigid relational model and is also suitable for data analysis focusing on links and the network topology, e.g., a connected component analysis. Our research is inspired by the data OLAP (OnLine Analytical Processing) approach of creating a special database designed for data analysis, a network-driven data analysis using graph databases, in our case. Beyond an initial ETL (Extract, Transform and Load) approach, we are facing the challenge of migrating the data from the relational to the graph database, managing a dynamic coexistence and evolution of both, not supported by related work. This work is motivated by a joint research involving network-driven data analysis over the FishBase global information system. We present a novel approach to analyzing the connections among thousands of identification keys and species and to linking local data to third party knowledge bases on the web.},
address = {Munich, Germany},
author = {Patrícia Cavoto and Victor Cardo and Régine Vignes Lebbe and André Santanchè},
booktitle = {2015 IEEE 11th International Conference on e-Science (e-Science 2015)},
date = {2015-08-31},
keyword = {Biodiversity Information Systems, graph database, network topology analysis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/Cavoto2015.pdf},
pages = {177-186},
publisher = {IEEE},
title = {FishGraph: A Network-Driven Data Analysis},
year = {2015}
}
There are a lot of data about biodiversity stored in different database models and most of them are relational. Recent research shows the importance of links and network analysis to discover knowledge in existing data. However, the relational model was not designed to address problems in which the links between data have the same importance as the data -- a common scenario in the biodiversity area. Moreover, the Linked Data and Semantic Web efforts empowered the fast growth of open knowledge repositories on the web, mainly in the RDF (Resource Description Framework) graph model. The flexible graph database model contrasts with the rigid relational model and is also suitable for data analysis focusing on links and the network topology, e.g., a connected component analysis. Our research is inspired by the data OLAP (OnLine Analytical Processing) approach of creating a special database designed for data analysis, a network-driven data analysis using graph databases, in our case. Beyond an initial ETL (Extract, Transform and Load) approach, we are facing the challenge of migrating the data from the relational to the graph database, managing a dynamic coexistence and evolution of both, not supported by related work. This work is motivated by a joint research involving network-driven data analysis over the FishBase global information system. We present a novel approach to analyzing the connections among thousands of identification keys and species and to linking local data to third party knowledge bases on the web.
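The connected-component analysis mentioned above can be sketched with a standard union-find structure (a textbook technique, not FishGraph's code), grouping identification keys and species that are linked to each other:

```python
# Union-find sketch for connected components over key-species links.
# Node names are illustrative.

parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

links = [("key1", "sp1"), ("key1", "sp2"), ("key2", "sp3")]
for a, b in links:
    union(a, b)

# Collect components: nodes sharing a root belong to one component.
groups = {}
for node in parent:
    groups.setdefault(find(node), set()).add(node)
print(sorted(len(g) for g in groups.values()))  # [2, 3]
```

At FishBase scale the same pass reveals which clusters of keys and species are interconnected and which stand isolated.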
|
Bernardo, Ivelize Rocha;
Borges, Michela;
Baranauskas, Maria Cecília Calani;
Santanchè, André
Interpretation of Construction Patterns for Biodiversity Spreadsheets (article)
Enterprise Information Systems,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Biodiversity data integration, Pattern recognition, Semantic mapping, Spreadsheet interpretation
)
@article{Bernardo2015,
abstract = {Spreadsheets are widely adopted as “popular databases”, where authors shape their solutions interactively. Although spreadsheets are easily adaptable by the author, their informal schemas cannot be automatically interpreted by machines to integrate data across independent spreadsheets. In biology, we observed a significant amount of biodiversity data in spreadsheets treated as isolated entities with different tabular organizations, but with high potential for data articulation. In order to automatically interpret these spreadsheets we exploit construction patterns followed by users in the biodiversity domain. This paper details evidences of such patterns and how they can lead to characterize the nature of a spreadsheet, as well as, its fields in a domain. It combines an automatic analysis of thousands of spreadsheets, collected on the Web, with results from a survey conducted with biologists. We propose a representation model to be used in automatic interpretation systems that captures these patterns.},
author = {Ivelize Rocha Bernardo and Michela Borges and Maria Cecília Calani Baranauskas and André Santanchè},
date = {2015-07-31},
journal = {Enterprise Information Systems},
keyword = {Biodiversity data integration, Pattern recognition, Semantic mapping, Spreadsheet interpretation},
link = {http://link.springer.com/chapter/10.1007/978-3-319-22348-3_22},
pages = {397-414},
title = {Interpretation of Construction Patterns for Biodiversity Spreadsheets},
volume = {227},
year = {2015}
}
Spreadsheets are widely adopted as “popular databases”, where authors shape their solutions interactively. Although spreadsheets are easily adaptable by the author, their informal schemas cannot be automatically interpreted by machines to integrate data across independent spreadsheets. In biology, we observed a significant amount of biodiversity data in spreadsheets treated as isolated entities with different tabular organizations, but with high potential for data articulation. In order to automatically interpret these spreadsheets we exploit construction patterns followed by users in the biodiversity domain. This paper details evidences of such patterns and how they can lead to characterize the nature of a spreadsheet, as well as, its fields in a domain. It combines an automatic analysis of thousands of spreadsheets, collected on the Web, with results from a survey conducted with biologists. We propose a representation model to be used in automatic interpretation systems that captures these patterns.
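A crude heuristic of our own devising (not the paper's recognition model) illustrates what exploiting a construction pattern can look like: guessing whether a sheet's first row is a header by comparing its proportion of textual cells with the row below.

```python
# Toy construction-pattern heuristic, for illustration only: header rows
# tend to be mostly text even when the data rows beneath are numeric.

def looks_like_header(rows):
    def text_ratio(row):
        non_numeric = sum(
            1 for c in row
            if not str(c).lstrip("-").replace(".", "").isdigit()
        )
        return non_numeric / len(row)
    return text_ratio(rows[0]) > text_ratio(rows[1])

sheet = [
    ["species", "count", "latitude"],
    ["Danio rerio", 12, -22.8],
    ["Mus musculus", 7, -23.1],
]
print(looks_like_header(sheet))  # True
```

A real interpreter would combine many such signals (formatting, position, vocabulary) before committing to a reading of the sheet, which is the role of the representation model the paper proposes.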
|
Batista, Lucas Oliveira
Apoio ao Estudo de Correlações entre Séries Temporais baseadas em Anotações Semânticas (mastersthesis)
Universidade Estadual de Campinas - UNICAMP,
mastersthesis,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Semantic Annotation, Time Series Search, Time Series Semantic Annotation Model
)
@mastersthesis{Batista2015,
abstract = {Time series are used in several knowledge domains, e.g., economics, meteorology, and agriculture. In many situations, scientists associate annotations with series during their analysis. They also need to search for and correlate several kinds of series to study a given problem. This is hampered not only by the heterogeneity among series, but also by the limitations of the mechanisms to search for series relevant to a correlation. The predominant search modalities are based either on text matching (over annotations) or on pattern matching; neither allows searching for semantically related series. Given this scenario, this dissertation proposes TS³Annotation, a framework that uses semantic annotations as a basis to support the study of correlations among series. The main contributions of this dissertation are: (1) a semantic annotation model for time series; and (2) the TS³Annotation framework, which allows experts to semantically annotate series and explores the use of these annotations as a new possibility for time series search.},
author = {Lucas Oliveira Batista},
date = {2015-07-07},
keyword = {Semantic Annotation, Time Series Search, Time Series Semantic Annotation Model},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/08/LucasBatista_DissertacaoFinal.pdf},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {Apoio ao Estudo de Correlações entre Séries Temporais baseadas em Anotações Semânticas},
year = {2015}
}
Time series are used in several knowledge domains, e.g., economics, meteorology, and agriculture. In many situations, scientists associate annotations with series during their analysis. They also need to search for and correlate several kinds of series to study a given problem. This is hampered not only by the heterogeneity among series, but also by the limitations of the mechanisms to search for series relevant to a correlation. The predominant search modalities are based either on text matching (over annotations) or on pattern matching; neither allows searching for semantically related series. Given this scenario, this dissertation proposes TS³Annotation, a framework that uses semantic annotations as a basis to support the study of correlations among series. The main contributions of this dissertation are: (1) a semantic annotation model for time series; and (2) the TS³Annotation framework, which allows experts to semantically annotate series and explores the use of these annotations as a new possibility for time series search.
|
Santo, Jacqueline Midlej do Espírito
Especificação e Detecção de Padrões Complexos de Variáveis Ambientais em Aplicações de Biodiversidade (mastersthesis)
Universidade Estadual de Campinas - UNICAMP,
mastersthesis,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Complex Event Processing, Pattern Detection, Pattern Specification
)
@mastersthesis{santo2015,
abstract = {Biodiversity applications are characterized by the need for a wide variety of environmental data at multiple scales. This context involves a huge amount of data generated by heterogeneous sources, with sensor data streams among the main ones. An open problem in this context is how to specify and detect scenarios of interest from environmental variables at multiple scales, helping scientists analyze phenomena and correlations with data collected in the field. To help solve this problem, this dissertation builds on Complex Event Processing theory to allow the specification of scenarios through patterns and the detection of scenario occurrences in real time. In this literature, data are treated as events, and patterns are described by the specification of events and their relationships. Event languages, however, do not consider spatial aspects (necessary in biodiversity) and offer limited event composition. Given this context, the dissertation proposes a logic-based language with which scientists can specify scenarios of interest, based on the composition of complex events.
The main contributions of the dissertation are: the proposal of a framework architecture for complex event detection, extending the work of Koga 2013; a data model to represent biodiversity events; and a language to describe patterns hierarchically, exploring the spatial and temporal relationships among events at different levels of abstraction.},
author = {Jacqueline Midlej do Espírito Santo},
date = {2015-07-06},
keyword = {Complex Event Processing, Pattern Detection, Pattern Specification},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/07/JacquelineMidlej-Dissertacao.pdf},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {Especificação e Detecção de Padrões Complexos de Variáveis Ambientais em Aplicações de Biodiversidade},
year = {2015}
}
|
Beserra, Renato
Quality Flow: a collaborative quality-aware platform for experiments in eScience (mastersthesis)
Universidade Estadual de Campinas - UNICAMP,
mastersthesis,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Data quality
)
@mastersthesis{Beserra2015,
abstract = {Many scientific research procedures rely upon the analysis of data obtained from heterogeneous sources. The validity of the research results depends, among others, on the quality of data. Data quality is a topic that has pervaded computer science research for decades. Though there are many proposals for data quality assessment, there are still open problems such as mechanisms to support flexible quality assessment and ways to derive data quality. The goal of this dissertation is to work on these issues. The main contribution of this dissertation is the proposal of QualityFlow: a quality-aware collaborative platform for experiments in eScience. The following contributions were accomplished: to support the creation of quality-aware scientific workflows, allowing the addition of quality attributes to workflows, while at the same time letting distinct users define their specific quality metrics for the same workflow; to allow users to keep track of different quality assessments for a given process, thereby providing insights into the actual value of data and workflow; and to allow scientists to customize data quality dimensions and quality metrics collaboratively. QualityFlow was developed as a web prototype, and executed in two experiments - one based upon a real problem and the other on a sample workflow.},
author = {Renato Beserra},
date = {2015-06-12},
keyword = {Data quality},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/TeseRenatoBeserra.pdf},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {Quality Flow: a collaborative quality-aware platform for experiments in eScience},
year = {2015}
}
|
Creus-Tomàs, Jordi;
Faria, Fabio Augusto;
Esquerdo, Júlio César Dalla Mora;
Coutinho, Alexandre Camargo;
Medeiros, Claudia Bauzer
SiRCub -- Brazilian Agricultural Crop Recognition System (conference)
XVII Simpósio Brasileiro de Sensoriamento Remoto,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Conference, crop classification, LULC, NDVI, séries temporais, SVM, time series, Timesat
)
@conference{Creus-Tomas2015,
abstract = {This paper presents a novel approach to classify agricultural crops using NDVI time series. The novelty lies in i) extracting a set of features from each and every NDVI curve, and ii) using them to train a crop classification model using a Support Vector Machine (SVM). Specifically, we use the TIMESAT program package to: 1) smooth the time series, 2) decompose them into agricultural seasons (a season is the period between sowing and harvesting), and 3) extract the features for each season. The 11 crop features we extract include the start and end of season, its amplitude, and the curve gradients of the sprouting and senescence periods, among others. Once we have the collection of features, they are fed into an SVM system (we use the LIBSVM library), together with a collection of annotations about the land use of the corresponding time series. These annotations represent the type of crop for a given location and agricultural season, and they are provided by specialists from Embrapa. As a result we obtain a classification model that allows for identifying different crop classes. Our methodology is generic and can be applied to a variety of regions and crop types. We have developed a system called SiRCub (Sistema de Reconhecimento de Culturas brasileiro), which implements this methodology. Thus, we describe in this paper the architecture of the system and the crop model learning methodology.},
author = {Jordi Creus-Tomàs and Fabio Augusto Faria and Júlio César Dalla Mora Esquerdo and Alexandre Camargo Coutinho and Claudia Bauzer Medeiros},
booktitle = {XVII Simpósio Brasileiro de Sensoriamento Remoto},
date = {2015-04-25},
keyword = {Conference, crop classification, LULC, NDVI, séries temporais, SVM, time series, Timesat},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/04/sircub_sbsr1.pdf},
title = {SiRCub -- Brazilian Agricultural Crop Recognition System},
year = {2015}
}
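The seasonal-feature step this abstract describes (TIMESAT metrics such as start and end of season, amplitude, and edge gradients) can be pictured with a minimal sketch. The threshold rule, toy NDVI curve, and reduced feature set below are illustrative assumptions, not the actual TIMESAT implementation or the paper's full set of 11 features.

```python
# Illustrative sketch: derive a few TIMESAT-style seasonal features from a
# toy NDVI curve. The threshold-based season detection is a simplification.

def season_features(ndvi, threshold=0.3):
    """Return start, end, amplitude, and edge gradients of the season."""
    above = [i for i, v in enumerate(ndvi) if v >= threshold]
    start, end = above[0], above[-1]
    peak = max(range(start, end + 1), key=lambda i: ndvi[i])
    return {
        "start_of_season": start,
        "end_of_season": end,
        "amplitude": ndvi[peak] - min(ndvi[start], ndvi[end]),
        "sprouting_gradient": (ndvi[peak] - ndvi[start]) / max(peak - start, 1),
        "senescence_gradient": (ndvi[end] - ndvi[peak]) / max(end - peak, 1),
    }

# One toy growing cycle: sprouting, peak, senescence.
curve = [0.1, 0.2, 0.4, 0.6, 0.8, 0.7, 0.5, 0.3, 0.2, 0.1]
features = season_features(curve)
```

A vector of such per-season features, paired with a land-use label, is what would then be fed to an SVM trainer.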
|
Gomes-Jr, Luiz;
Amann, Bernd;
Santanche, André
Beta-Algebra: Towards a Relational Algebra for Graph Analysis (conference)
Workshop Proceedings of the EDBT/ICDT 2015 Joint Conference,
GraphQ/EDBT 2015,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Gomes-Jr2015b,
abstract = {Graph analysis is an essential tool to understand natural and man-made networks, such as social networks, food webs, transportation infrastructures, etc. Although graph analysis has fomented the development of algorithms, visual tools, and distributed processing frameworks, there is still little support for analysis at the query language level. Current graph query languages are mostly concerned with flexible matching of subgraphs, while graph processing frameworks are mostly concerned with fast parallel execution of instructions. Our goal is to provide analysis capabilities at the language level, allowing more interactive and explorative query-based analysis. In this paper, we present our ongoing efforts towards a relational algebra extension that offers an operator for graph-based data aggregation. The beta (β) operator is composed of four suboperators, which are used to control the path-based aggregations. The β-algebra allows seamless composition of queries that mix relational and graph-based aspects. Here we introduce our current algebra and provide examples of its use. We also show how we are using the analysis strategy in query scenarios. Since the algebra-based query scenario allows for execution plan rewritings, we also discuss our first efforts on equivalence rules for query optimization.},
author = {Luiz Gomes-Jr and Bernd Amann and André Santanche},
booktitle = {Workshop Proceedings of the EDBT/ICDT 2015 Joint Conference},
date = {2015-03-18},
journal = {GraphQ/EDBT 2015},
keyword = {Conference},
link = {http://ceur-ws.org/Vol-1330/paper-26.pdf},
title = {Beta-Algebra: Towards a Relational Algebra for Graph Analysis},
year = {2015}
}
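The path-based aggregation that the β suboperators control can be pictured, very loosely, as a bounded traversal that folds an attribute found along paths from a start node. The toy graph, the score attribute, and the fold below are invented for illustration; they are not the β-algebra itself.

```python
from collections import deque

# Toy sketch in the spirit of graph-based data aggregation: starting from a
# node, visit neighbours up to a path-length bound and fold a numeric
# attribute found along the way. Graph and attribute values are invented.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
score = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}

def aggregate(start, max_depth, fold=sum):
    seen, frontier, collected = {start}, deque([(start, 0)]), []
    while frontier:
        node, depth = frontier.popleft()
        collected.append(score[node])
        if depth < max_depth:            # bound the paths considered
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return fold(collected)

total = aggregate("a", max_depth=1)  # aggregates over a and its neighbours
```

In an algebra-level design, the bound, the traversal, and the fold would each be controlled by a suboperator, which is what enables rewriting for optimization.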
|
Gomes-Jr., Luiz;
Santanche, André
The Web Within: Leveraging Web Standards and Graph Analysis to Enable Application-Level Integration of Institutional Data (article)
Transactions on Large-Scale Data- and Knowledge-Centered Systems XIX,
2015.
(
Abstract |
Links |
BibTeX |
Tags:
Journal Paper
)
@article{Gomes-Jr.2015,
abstract = {The expansion of the Web and of our capacity of producing and storing information have had a profound impact on the way we organize, manipulate and share data. We have seen an increased specialization of database back-ends and data models to respond to modern application needs: text indexing engines organize unstructured data, standards and models were created to support the Semantic Web, Big Data requirements stimulated an explosion of data representation and manipulation models. This complex and heterogeneous environment demands unified strategies that enable data integration and, especially, cross-application, expressive querying. Here we present a new approach for the integration of structured and unstructured data within organizations. Our solution is based on the Complex Data Management System (CDMS), a system being developed to handle data typical of complex networks. The CDMS enables a relationship-centric interaction with data that brings many advantages to the institutional data integration scenario, allowing applications to rely on common models for data querying and manipulation. In our framework, diverse data models are integrated in a unifying RDF graph. A novel query model allows the combination of concepts from information retrieval, databases, and complex networks into a declarative query language that extends SPARQL. This query language enables flexible correlation queries over the unified data, enabling support for a wide range of applications such as CMSs, recommendation systems, social networks, etc. We also introduce Mappers, a data management mechanism that simplifies the integration of heterogeneous data and that is integrated in the query language for further flexibility. Experimental results from real data demonstrate the viability of our approach.},
author = {Luiz Gomes-Jr. and André Santanche},
date = {2015-02-24},
journal = {Transactions on Large-Scale Data- and Knowledge-Centered Systems XIX},
keyword = {Journal Paper},
link = {http://link.springer.com/chapter/10.1007%2F978-3-662-46562-2_2},
pages = {26-54},
title = {The Web Within: Leveraging Web Standards and Graph Analysis to Enable Application-Level Integration of Institutional Data},
volume = {8990},
year = {2015}
}
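The unifying-graph idea can be pictured as a tiny triple store queried with wildcard patterns. The predicates and records below are invented for illustration; a real deployment would use SPARQL (or the extended language the paper proposes) over an actual RDF store.

```python
# Minimal illustrative triple store: heterogeneous records unified as
# subject-predicate-object triples, queried with a single wildcard pattern.
triples = [
    ("doc1", "type", "report"),
    ("doc1", "author", "alice"),
    ("doc2", "type", "report"),
    ("doc2", "author", "bob"),
    ("alice", "memberOf", "lab42"),
]

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [(ts, tp, to) for ts, tp, to in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

reports = [s for s, _, _ in match(p="type", o="report")]
```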
|
2014 |
Santo, Jacqueline Midlej do Espírito;
Medeiros, Claudia Bauzer
Complex Pattern Detection and Specification to Support Biodiversity Applications (conference)
Proc of SBBD 2014 - WTDBD,
Sociedade Brasileira de Computação (SBC),
Curitiba - PR,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Complex Event Processing, Conference, Multiscale, Pattern Specification
)
@conference{SantoSBBD2014,
abstract = {Biodiversity scientists often need to define and detect scenarios of interest from data streams delivered from meteorological sensors. For example, scenarios such as deforestation or forest fires need to be detected in order to reduce their impact on the environment. Such data streams are characterized by their heterogeneity across spatial and temporal scales, which hampers detection of events and construction of scenarios. To help scientists in this task, this work proposes the use of the theory of Complex Event Processing (CEP) to define and detect complex event patterns in this context. The two main contributions focus on the specification of events and patterns for the biodiversity context and on the mechanism to detect these patterns. The first requires extending an Event Processing Language (EPL) to include spatial relationships in the pattern. The second will extend Koga’s framework [Koga 2013], which integrates heterogeneous data sources, with the detection of complex patterns. This paper extends the short paper accepted for the Brazilian e-Science Workshop (BreSci) 2014 with the specification for events and patterns.},
address = {Curitiba - PR},
author = {Jacqueline Midlej do Espírito Santo and Claudia Bauzer Medeiros},
booktitle = {Proc of SBBD 2014 - WTDBD},
date = {2014-10-08},
editor = {Mirela Moto et al},
issn = {2316-5170},
keyword = {Complex Event Processing, Conference, Multiscale, Pattern Specification},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/04/wtd_sbbd_v4-2.pdf},
note = {Short-paper},
pages = {288-294},
publisher = {Sociedade Brasileira de Computação (SBC)},
title = {Complex Pattern Detection and Specification to Support Biodiversity Applications},
year = {2014}
}
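The kind of composite pattern targeted here (one event followed by another within a time window) can be sketched as a minimal stream matcher. The event fields, thresholds, and fire-risk scenario below are invented for illustration and do not reflect the EPL extension proposed in the paper.

```python
# Minimal sketch of complex-event-pattern detection over a stream: report
# when a high-temperature reading is followed by a low-humidity reading
# within a time window. Fields and thresholds are invented.
def detect_fire_risk(events, window=3, t_high=35.0, h_low=20.0):
    matches, pending = [], []  # pending: timestamps of recent hot readings
    for t, kind, value in events:
        pending = [p for p in pending if t - p <= window]  # expire old ones
        if kind == "temperature" and value >= t_high:
            pending.append(t)
        elif kind == "humidity" and value <= h_low and pending:
            matches.append((pending[0], t))  # composite event detected
            pending = []
    return matches

stream = [
    (1, "temperature", 36.2),
    (2, "humidity", 45.0),   # too humid: no match yet
    (3, "humidity", 15.0),   # dry within window -> composite event
    (9, "humidity", 10.0),   # no recent hot reading -> ignored
]
alerts = detect_fire_risk(stream)
```

A full CEP engine generalizes this idea with declarative pattern specifications, event composition, and (in this work's proposal) spatial relationships between events.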
|
Batista, Lucas Oliveira;
Medeiros, Claudia Bauzer
Searching Time Series via Semantic Annotations (conference)
Proc. SBBD 2014 - XIII Workshop de Teses e Dissertações em Banco de Dados (WTDBD),
Sociedade Brasileira de Computação (SBC),
Curitiba - PR, Brasil,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Semantic Annotation, Time Series Search, Time Series Semantic Annotation Model
)
@conference{BatistaMedeiros2014b,
abstract = {Time series are used in several domains of knowledge. During their analysis, experts often create or analyze associations between time series and annotations. In order to study a problem, for example, patient behavior or crop patterns, experts need to search and correlate several time series. However, finding appropriate series related to a problem is a difficult task. Search is usually performed using a few parameters, such as a series' geographic location. Annotations may be used to help the search via string matching. Given this scenario, this paper discusses a work in progress to design and partially develop a software framework to search time series via semantic annotations. It will support experts in the correlation of time series, foster collaboration among experts, and allow the use of Linked Data concepts to aggregate knowledge to content. This paper extends the short paper accepted for the Brazilian e-Science Workshop (BRESCI) 2014. The extensions include a time series semantic annotation model, implementation details, and a longer theoretical related work section.},
address = {Curitiba - PR, Brasil},
author = {Lucas Oliveira Batista and Claudia Bauzer Medeiros},
booktitle = {Proc. SBBD 2014 - XIII Workshop de Teses e Dissertações em Banco de Dados (WTDBD)},
date = {2014-10-08},
issn = {2316-5170},
keyword = {Semantic Annotation, Time Series Search, Time Series Semantic Annotation Model},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/04/LucasBatista_WTDBDFInal1.pdf},
note = {Short-paper},
pages = {339-345},
publisher = {Sociedade Brasileira de Computação (SBC)},
title = {Searching Time Series via Semantic Annotations},
year = {2014}
}
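The difference between plain string matching and semantic search over annotations can be sketched with a tiny invented vocabulary: a query term is expanded with narrower terms before matching. This is purely illustrative and is not the framework's actual annotation model.

```python
# Sketch of searching time series through semantic annotations rather than
# plain string match: expand the query term with narrower terms from a tiny
# invented vocabulary, then match against each series' annotations.
narrower = {"precipitation": {"rain", "snow"}}

annotations = {
    "series-1": {"rain", "temperature"},
    "series-2": {"snow"},
    "series-3": {"soil moisture"},
}

def expand(term):
    terms, stack = {term}, [term]
    while stack:
        for t in narrower.get(stack.pop(), ()):  # follow narrower-term links
            if t not in terms:
                terms.add(t)
                stack.append(t)
    return terms

def search(term):
    wanted = expand(term)
    return sorted(s for s, tags in annotations.items() if tags & wanted)

hits = search("precipitation")  # matches rain and snow series, not a string
```

With Linked Data vocabularies, the `narrower` map would come from a published ontology rather than being hard-coded.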
|
Cavoto, Patrícia;
Santanchè, André
Arquitetura Híbrida de Integração entre Banco de Dados Relacional e de Grafos (conference)
Proc. SBBD 2014 - XIII Workshop de Teses e Dissertações em Banco de Dados (WTDBD),
Sociedade Brasileira de Computação (SBC),
Curitiba - PR, Brasil,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
banco de dados híbrido, integração de bases, modelo de grafos, modelo relacional
)
@conference{Cavoto2014,
abstract = {The complexity and volume of the relationships among pieces of information, as well as the need to maintain and integrate data with heterogeneous structures, increase exponentially every day. This is particularly important in the context of eScience, especially biodiversity, the area of interest of this project, in which relationships are fundamental to the analyses. In this context, the graph database model can be a more appropriate and efficient approach for managing and retrieving this information. On the other hand, there is a large legacy of systems that use relational databases, which play a fundamental role in many tasks. In this work we therefore propose a hybrid integration architecture that allows the relational and graph models to coexist in their native forms, reducing the impact of adaptations to preexisting relational databases and exploiting the advantages of each native model in management and retrieval operations.},
address = {Curitiba - PR, Brasil},
author = {Patrícia Cavoto and André Santanchè},
booktitle = {Proc. SBBD 2014 - XIII Workshop de Teses e Dissertações em Banco de Dados (WTDBD)},
date = {2014-10-06},
issn = {2316-5170},
keyword = {banco de dados híbrido, integração de bases, modelo de grafos, modelo relacional},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2016/02/Cavoto2014.pdf},
note = {Short-paper},
pages = {274-280},
publisher = {Sociedade Brasileira de Computação (SBC)},
title = {Arquitetura Híbrida de Integração entre Banco de Dados Relacional e de Grafos},
year = {2014}
}
|
Santanchè, André;
Longo, João Sávio C.;
Jomier, Geneviève;
Zam, Michel;
Medeiros, Claudia Bauzer
Multi-focus Research and Geospatial Data - anthropocentric concerns (article)
JIDM - Journal of Information and Data Management,
2,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Geospatial data, Multiple aspects, Multiscale, Multiscale views, Version
)
@article{SantancheMedeirosJomier2014,
abstract = {Work on multiscale issues presents countless challenges that have long been attacked by GIScience researchers. Research is usually concentrated in one of two directions - new data models to support handling multiple scales, or data structures and algorithms to process data across scales. Complementary implementation aspects are concerned with generalization (and/or virtualization of distinct scales), or with linking entities of interest across scales (e.g., using bottom-up implementation of specific structures, without relying on any specific DBMS). However, researchers seldom take into account the fact that multiscale scenarios are increasingly constructed cooperatively, and require distinct perspectives of the world, in which each research group considers specific aspects of a problem. The combination of handling multiple scales at a time, and having multiple user perspectives per scale, constitutes what we call multi-focus research. This paper presents our proposal to attack multi-focus scenarios, which considers distinct aspects of the problem of managing multiple scales, illustrated with examples of multiscale geospatial data. Our approach builds upon a specific database version model – the so-called multiversion MVDB – which has already been successfully implemented in several geospatial scenarios, being extended here to support multi-focus research. This extension was implemented and tested in a real world case study, briefly discussed here.},
author = {André Santanchè and João Sávio C. Longo and Geneviève Jomier and Michel Zam and Claudia Bauzer Medeiros},
date = {2014-09-18},
journal = {JIDM - Journal of Information and Data Management},
keyword = {Geospatial data, Multiple aspects, Multiscale, Multiscale views, Version},
link = {https://seer.lcc.ufmg.br/index.php/jidm/article/view/418/626},
number = {2},
pages = {146-160},
title = {Multi-focus Research and Geospatial Data - anthropocentric concerns},
volume = {5},
year = {2014}
}
|
Santo, Jacqueline Midlej do Espírito;
Medeiros, Claudia Bauzer
Complex Pattern Detection and Specification from Multiscale Environmental Variables for Biodiversity Applications (conference)
Proc. of CSBC 2014 - BreSci,
Sociedade Brasileira de Computação (SBC),
Brasília - DF,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Complex Event Processing, Conference, Pattern Detection
)
@conference{SantoBRESCI2014,
abstract = {Biodiversity scientists often need to define and detect scenarios of interest from data streams delivered by meteorological sensors. Such streams are characterized by their heterogeneity across spatial and temporal scales, which hampers construction of scenarios. To help them in this task, this paper proposes the use of the theory of Complex Event Processing (CEP) to detect complex event patterns in this context.},
address = {Brasília - DF},
author = {Jacqueline Midlej do Espírito Santo and Claudia Bauzer Medeiros},
booktitle = {Proc. of CSBC 2014 - BreSci},
date = {2014-07-31},
editor = {Eduardo Adilio Pelinson Alchieri and Priscila Solís Barreto},
issn = {2175-2761},
keyword = {Complex Event Processing, Conference, Pattern Detection},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/04/bresciFinal.pdf},
note = {Short-paper},
pages = {389-392},
publisher = {Sociedade Brasileira de Computação (SBC)},
title = {Complex Pattern Detection and Specification from Multiscale Environmental Variables for Biodiversity Applications},
year = {2014}
}
Biodiversity scientists often need to define and detect scenarios of interest from data streams concerning meteorological sensors. Such streams are characterized by their heterogeneity across spatial and temporal scales, which hampers the construction of scenarios. To help them in this task, this paper proposes the use of the theory of Complex Event Processing (CEP) to detect complex event patterns in this context.
|
Batista, Lucas Oliveira;
Medeiros, Claudia Bauzer
Supporting the Study of Correlations between Time Series via Semantic Annotations (conference)
Proc. CSBC 2014 - VIII Brazilian e-Science Workshop (BRESCI),
Sociedade Brasileira de Computação (SBC),
Brasília - DF, Brasil,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Semantic Annotation, time series
)
@conference{BatistaMedeiros2014,
abstract = {This paper presents work in progress on the design and development of a software framework that supports experts in the correlation of time series. It will allow searching for time series via semantic annotations, thereby fostering collaboration among experts and adding knowledge to the content.},
address = {Brasília - DF, Brasil},
author = {Lucas Oliveira Batista and Claudia Bauzer Medeiros},
booktitle = {Proc. CSBC 2014 - VIII Brazilian e-Science Workshop (BRESCI)},
date = {2014-07-31},
editor = {Eduardo Adilio Pelinson Alchieri and Priscila Solís Barreto},
issn = {2175-2761},
keyword = {Semantic Annotation, time series},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/04/LucasBatistaBresciFinal.pdf},
note = {Short-paper},
pages = {385-388},
publisher = {Sociedade Brasileira de Computação (SBC)},
title = {Supporting the Study of Correlations between Time Series via Semantic Annotations},
year = {2014}
}
This paper presents work in progress on the design and development of a software framework that supports experts in the correlation of time series. It will allow searching for time series via semantic annotations, thereby fostering collaboration among experts and adding knowledge to the content.
|
Miranda, Eduardo;
Grand, Anaïs;
Lebbe, Régine Vignes;
Santanchè, André
Towards a Linked Biology - An integrated perspective of phenotypes and phylogenetic trees (conference)
10th International Conference on Data Integration in the Life Sciences,
Lisbon, Portugal,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Miranda2014b,
abstract = {A large number of studies in biology, including those involving phylogenetic tree reconstruction, result in the production of a huge amount of data – e.g., phenotype descriptions, morphological data matrices, etc. Biologists increasingly face a challenge and opportunity of effectively discovering useful knowledge by crossing and comparing several pieces of information, not always linked and integrated. Our motivation stems from the idea of transforming these data into a network of relationships, looking for links among related elements and enhancing the ability to solve more complex problems supported by machines. This work addresses this problem through a graph database model, linking and coupling phylogenetic trees and phenotype descriptions. In this paper we give an overview of an experiment exploiting the synergy of linked data sources to support biologists in data analysis, comparison and inferences.},
address = {Lisbon, Portugal},
author = {Eduardo Miranda and Anaïs Grand and Régine Vignes Lebbe and André Santanchè},
date = {2014-07-17},
booktitle = {10th International Conference on Data Integration in the Life Sciences},
keyword = {Conference},
link = {http://dils2014.inesc-id.pt/data/uploads/paper_38.pdf},
organization = {DILS 2014},
publisher = {10th International Conference on Data Integration in the Life Sciences},
title = {Towards a Linked Biology - An integrated perspective of phenotypes and phylogenetic trees},
year = {2014}
}
A large number of studies in biology, including those involving phylogenetic tree reconstruction, result in the production of a huge amount of data – e.g., phenotype descriptions, morphological data matrices, etc. Biologists increasingly face a challenge and opportunity of effectively discovering useful knowledge by crossing and comparing several pieces of information, not always linked and integrated. Our motivation stems from the idea of transforming these data into a network of relationships, looking for links among related elements and enhancing the ability to solve more complex problems supported by machines. This work addresses this problem through a graph database model, linking and coupling phylogenetic trees and phenotype descriptions. In this paper we give an overview of an experiment exploiting the synergy of linked data sources to support biologists in data analysis, comparison and inferences.
|
Daltio, Jaudete;
Medeiros, Claudia Bauzer
Handling Multiple Foci in Graph Databases (conference)
Lecture Notes in Bioinformatics (LNBI) - Proceedings of 10th International Conference on Data Integration in the Life Sciences,
Lisboa, Portugal,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Daltio2014,
abstract = {Scientific research has become data-intensive and data-dependent, with distributed, multidisciplinary teams creating and sharing their findings. Graph databases are being increasingly considered as a computational means to loosely integrate such data, in particular when relationships among data and the data itself are at the same importance level. However, a problem to be faced in this context is that of multiple foci – where a focus, here, is a perspective on the data, for a particular research team and context. This paper describes a conceptual framework for the construction of arbitrary foci on graph databases, to help solve this problem. The framework, under construction, is illustrated using examples based on the needs of teams involved in biodiversity research.},
address = {Lisboa, Portugal},
author = {Jaudete Daltio and Claudia Bauzer Medeiros},
booktitle = {Lecture Notes in Bioinformatics (LNBI) - Proceedings of 10th International Conference on Data Integration in the Life Sciences},
date = {2014-07-17},
editor = {Springer International Publishing Switzerland},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/04/Handling-Multiple-Foci-in-Graph-Databases.pdf},
pages = {58-65},
title = {Handling Multiple Foci in Graph Databases},
volume = {8574},
year = {2014}
}
Scientific research has become data-intensive and data-dependent, with distributed, multidisciplinary teams creating and sharing their findings. Graph databases are being increasingly considered as a computational means to loosely integrate such data, in particular when relationships among data and the data itself are at the same importance level. However, a problem to be faced in this context is that of multiple foci – where a focus, here, is a perspective on the data, for a particular research team and context. This paper describes a conceptual framework for the construction of arbitrary foci on graph databases, to help solve this problem. The framework, under construction, is illustrated using examples based on the needs of teams involved in biodiversity research.
|
Cugler, Daniel Cintra
Supporting the collection and curation of biological observation metadata (phdthesis)
Universidade Estadual de Campinas - UNICAMP,
phdthesis,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Biodiversity Information Systems
)
@phdthesis{Cugler2014,
abstract = {Biological observation databases contain information about the occurrence of an organism or set of organisms detected at a given place and time according to some methodology. Such databases store a variety of data, at multiple spatial and temporal scales, including images, maps, sounds, texts and so on. This priceless information can be used in a wide range of research initiatives, e.g., global warming, species behavior or food production. All such studies are based on analyzing the records themselves, and their metadata. Most of the time, analyses start from metadata, often used to index the observation records. However, given the nature of observation activities, metadata may suffer from quality problems, hampering such analyses. For example, there may be metadata gaps (e.g., missing attributes, or insufficient records). This can have serious effects: in biodiversity studies, for instance, metadata problems regarding a single species can affect the understanding not just of the species, but of wider ecological interactions. This thesis proposes a set of processes to help solve problems in metadata quality. While previous approaches concern one given aspect of the problem, the thesis provides an architecture and algorithms that encompass the whole cycle of managing biological observation metadata, which goes from acquiring data to retrieving database records. Our contributions are divided into two categories: (a) data enrichment and (b) data cleaning. Contributions in category (a) provide additional information for both missing attributes in existing records, and missing records for specific requirements. Our strategies use authoritative remote data sources and VGI (Volunteered Geographic Information) to enrich such metadata, providing missing information. Contributions in category (b) detect anomalies in biological observation metadata by performing spatial analyses that contrast the location of the observations with authoritative geographic distribution maps. 
Thus, the main contributions are: (i) an architecture to retrieve biological observation records, which derives missing attributes by using external data sources; (ii) a geographical approach for anomaly detection and (iii) an approach for adaptive acquisition of VGI to fill out metadata gaps, using mobile devices and sensors. These contributions were validated by actual implementations, using as case study the challenges presented by the management of biological observation metadata of the Fonoteca Neotropical Jacques Vielliard (FNJV), one of the top 10 animal sound collections in the world.},
author = {Daniel Cintra Cugler},
date = {2014-05-08},
keyword = {Biodiversity Information Systems},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2017/01/CuglerDanielCintra_D.pdf},
note = {Supervisor Claudia Bauzer Medeiros},
school = {Universidade Estadual de Campinas - UNICAMP},
title = {Supporting the collection and curation of biological observation metadata},
year = {2014}
}
Biological observation databases contain information about the occurrence of an organism or set of organisms detected at a given place and time according to some methodology. Such databases store a variety of data, at multiple spatial and temporal scales, including images, maps, sounds, texts and so on. This priceless information can be used in a wide range of research initiatives, e.g., global warming, species behavior or food production. All such studies are based on analyzing the records themselves, and their metadata. Most of the time, analyses start from metadata, often used to index the observation records. However, given the nature of observation activities, metadata may suffer from quality problems, hampering such analyses. For example, there may be metadata gaps (e.g., missing attributes, or insufficient records). This can have serious effects: in biodiversity studies, for instance, metadata problems regarding a single species can affect the understanding not just of the species, but of wider ecological interactions. This thesis proposes a set of processes to help solve problems in metadata quality. While previous approaches concern one given aspect of the problem, the thesis provides an architecture and algorithms that encompass the whole cycle of managing biological observation metadata, which goes from acquiring data to retrieving database records. Our contributions are divided into two categories: (a) data enrichment and (b) data cleaning. Contributions in category (a) provide additional information for both missing attributes in existing records, and missing records for specific requirements. Our strategies use authoritative remote data sources and VGI (Volunteered Geographic Information) to enrich such metadata, providing missing information. Contributions in category (b) detect anomalies in biological observation metadata by performing spatial analyses that contrast the location of the observations with authoritative geographic distribution maps. 
Thus, the main contributions are: (i) an architecture to retrieve biological observation records, which derives missing attributes by using external data sources; (ii) a geographical approach for anomaly detection and (iii) an approach for adaptive acquisition of VGI to fill out metadata gaps, using mobile devices and sensors. These contributions were validated by actual implementations, using as case study the challenges presented by the management of biological observation metadata of the Fonoteca Neotropical Jacques Vielliard (FNJV), one of the top 10 animal sound collections in the world.
|
Bernardo, Ivelize Rocha;
Santanchè, André;
Baranauskas, Maria Cecília Calani
Automatic Interpretation Biodiversity Spreadsheets Based on Recognition of Construction Patterns (conference)
Proceedings of the 16th International Conference on Enterprise Information Systems (ICEIS 2014),
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Biodiversity data integration, Information Integration, Patterns Recognition, Semantic mapping, Spreadsheet interpretation
)
@conference{Bernardo2014,
abstract = {Spreadsheets are widely adopted as "popular databases", where authors shape their solutions interactively. Although spreadsheets have characteristics that facilitate their adaptation by the author, they are not designed to integrate data across independent spreadsheets. In biology, we observed a significant amount of biodiversity data in spreadsheets treated as isolated entities with different tabular organizations, but with high potential for data articulation. In order to promote interoperability among these spreadsheets, we propose in this paper a technique based on pattern recognition of spreadsheets belonging to the biodiversity domain. It can be exploited to identify the spreadsheet at a higher level of abstraction – e.g., it is possible to identify the nature of a spreadsheet as a catalog or a collection of specimens – improving the interoperability process. The paper details evidence of construction patterns of spreadsheets and proposes a semantic representation for them.},
author = {Ivelize Rocha Bernardo and André Santanchè and Maria Cecília Calani Baranauskas},
booktitle = {Proceedings of the 16th International Conference on Enterprise Information Systems (ICEIS 2014)},
date = {2014-04-27},
keyword = {Biodiversity data integration, Information Integration, Patterns Recognition, Semantic mapping, Spreadsheet interpretation},
link = {http://www.scitepress.org/DigitalLibrary/PublicationsDetail.aspx?ID=oEnR7oWqnHw=&t=1},
pages = {57-68},
title = {Automatic Interpretation Biodiversity Spreadsheets Based on Recognition of Construction Patterns},
year = {2014}
}
Spreadsheets are widely adopted as 'popular databases', where authors shape their solutions interactively. Although spreadsheets have characteristics that facilitate their adaptation by the author, they are not designed to integrate data across independent spreadsheets. In biology, we observed a significant amount of biodiversity data in spreadsheets treated as isolated entities with different tabular organizations, but with high potential for data articulation. In order to promote interoperability among these spreadsheets, we propose in this paper a technique based on pattern recognition of spreadsheets belonging to the biodiversity domain. It can be exploited to identify the spreadsheet at a higher level of abstraction – e.g., it is possible to identify the nature of a spreadsheet as a catalog or a collection of specimens – improving the interoperability process. The paper details evidence of construction patterns of spreadsheets and proposes a semantic representation for them.
|
Senra, Rodrigo Dias Arruda;
Medeiros, Claudia Bauzer
Evaluate, Reorganize and Share: An Approach to Dynamically Organize Digital Hierarchies (article)
International Journal of Metadata, Semantics and Ontologies,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Data integration, Data sharing, Organization, Organograph, Personal Information Management
)
@article{SenraMedeiros2014,
abstract = {We are overwhelmed and overloaded with the data deluge brought by the digital age. Hierarchies are pervasive cognitive patterns that allow us to reorganize data and reduce the dimensionality of the information space to manageable levels (e.g., filesystems and navigational menus). In spite of their widespread adoption, such hierarchies can be improved to cope with the present needs of data sharing and reuse. First, we seldom use mechanisms to evaluate how well they partition the information space. Second, we build static and content-driven hierarchies instead of dynamic and context-driven (i.e., task-driven) ones. Third, we use ad hoc and implicit hierarchization criteria, whereas they should be explicit and shareable. This paper discusses the problems related to the construction of hierarchies, and presents a conceptual framework to turn them into reconfigurable and shareable artifacts. Moreover, it explores how dynamically reconfigurable hierarchies can better cope with the multi-faceted nature of content, illustrating these principles through a tool that validates our proposal.},
author = {Rodrigo Dias Arruda Senra and Claudia Bauzer Medeiros},
date = {2014-04-16},
journal = {International Journal of Metadata, Semantics and Ontologies},
keyword = {Data integration, Data sharing, Organization, Organograph, Personal Information Management},
link = {http://link.springer.com/article/10.1007%2Fs13740-014-0035-7},
pages = {15-28},
title = {Evaluate, Reorganize and Share: An Approach to Dynamically Organize Digital Hierarchies},
volume = {9},
year = {2014}
}
We are overwhelmed and overloaded with the data deluge brought by the digital age. Hierarchies are pervasive cognitive patterns that allow us to reorganize data and reduce the dimensionality of the information space to manageable levels (e.g., filesystems and navigational menus). In spite of their widespread adoption, such hierarchies can be improved to cope with the present needs of data sharing and reuse. First, we seldom use mechanisms to evaluate how well they partition the information space. Second, we build static and content-driven hierarchies instead of dynamic and context-driven (i.e., task-driven) ones. Third, we use ad hoc and implicit hierarchization criteria, whereas they should be explicit and shareable. This paper discusses the problems related to the construction of hierarchies, and presents a conceptual framework to turn them into reconfigurable and shareable artifacts. Moreover, it explores how dynamically reconfigurable hierarchies can better cope with the multi-faceted nature of content, illustrating these principles through a tool that validates our proposal.
|
Vilar, Bruno Siqueira Campos Mendonça
Context driven workflow adaptation applied to healthcare planning (phdthesis)
Instituto de Computação - Universidade Estadual de Campinas (UNICAMP),
Campinas - SP,
phdthesis,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Computer software, Health planning, Hospitals, Workflow, Workflow management systems
)
@phdthesis{vilar2014,
abstract = {Workflow Management Systems (WfMS) are used to manage the execution of processes, improving efficiency and efficacy of the procedure in use. The driving forces behind the adoption and development of WfMSs are business and scientific applications. Associated research efforts resulted in consolidated mechanisms, consensual protocols and standards. In particular, a scientific WfMS helps scientists to specify and run distributed experiments. It provides several features that support activities within an experimental environment, such as providing flexibility to change workflow design and keeping provenance (and thus reproducibility) of experiments. On the other hand, barring a few research initiatives, WfMSs do not provide appropriate support to dynamic, context-based customization during run-time; on-the-fly adaptations usually require user intervention. This thesis is concerned with mending this gap, providing WfMSs with a context-aware mechanism to dynamically customize workflow execution. As a result, we designed and developed DynFlow - a software architecture that allows such a customization, applied to a specific domain: healthcare planning. This application domain was chosen because it is a very good example of context-sensitive customization. Indeed, healthcare procedures constantly undergo unexpected changes that may occur during a treatment, such as a patient's reaction to a medicine. To meet dynamic customization demands, healthcare planning research has developed semi-automated techniques to support fast changes of the careflow steps according to a patient's state and evolution. One such technique is Computer-Interpretable Guidelines (CIG), whose most prominent member is the Task-Network Model (TNM) -- a rule based approach able to build on the fly a plan according to the context. 
Our research led us to conclude that CIGs do not support features required by health professionals, such as distributed execution, provenance and extensibility, which are available from WfMSs. In other words, CIGs and WfMSs have complementary characteristics, and both are directed towards execution of activities. Given the above facts, the main contributions of the thesis are the following: (a) the design and development of DynFlow, whose underlying model blends TNM characteristics with WfMS; (b) the characterization of the main advantages and disadvantages of CIG models and workflow models; and (c) the implementation of a prototype, based on ontologies, applied to nursing care. Ontologies are used as a solution to enable interoperability across distinct SWfMS internal representations, as well as to support distinct healthcare vocabularies and procedures.},
address = {Campinas - SP},
author = {Bruno Siqueira Campos Mendonça Vilar},
date = {2014-04-14},
keyword = {Computer software, Health planning, Hospitals, Workflow, Workflow management systems},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/09/VilarBrunoSiqueiraCamposMendonça_D.pdf},
school = {Instituto de Computação - Universidade Estadual de Campinas (UNICAMP)},
title = {Context driven workflow adaptation applied to healthcare planning},
year = {2014}
}
Workflow Management Systems (WfMS) are used to manage the execution of processes, improving efficiency and efficacy of the procedure in use. The driving forces behind the adoption and development of WfMSs are business and scientific applications. Associated research efforts resulted in consolidated mechanisms, consensual protocols and standards. In particular, a scientific WfMS helps scientists to specify and run distributed experiments. It provides several features that support activities within an experimental environment, such as providing flexibility to change workflow design and keeping provenance (and thus reproducibility) of experiments. On the other hand, barring a few research initiatives, WfMSs do not provide appropriate support to dynamic, context-based customization during run-time; on-the-fly adaptations usually require user intervention. This thesis is concerned with mending this gap, providing WfMSs with a context-aware mechanism to dynamically customize workflow execution. As a result, we designed and developed DynFlow - a software architecture that allows such a customization, applied to a specific domain: healthcare planning. This application domain was chosen because it is a very good example of context-sensitive customization. Indeed, healthcare procedures constantly undergo unexpected changes that may occur during a treatment, such as a patient's reaction to a medicine. To meet dynamic customization demands, healthcare planning research has developed semi-automated techniques to support fast changes of the careflow steps according to a patient's state and evolution. One such technique is Computer-Interpretable Guidelines (CIG), whose most prominent member is the Task-Network Model (TNM) -- a rule based approach able to build on the fly a plan according to the context. 
Our research led us to conclude that CIGs do not support features required by health professionals, such as distributed execution, provenance and extensibility, which are available from WfMSs. In other words, CIGs and WfMSs have complementary characteristics, and both are directed towards execution of activities. Given the above facts, the main contributions of the thesis are the following: (a) the design and development of DynFlow, whose underlying model blends TNM characteristics with WfMS; (b) the characterization of the main advantages and disadvantages of CIG models and workflow models; and (c) the implementation of a prototype, based on ontologies, applied to nursing care. Ontologies are used as a solution to enable interoperability across distinct SWfMS internal representations, as well as to support distinct healthcare vocabularies and procedures.
|
Sousa, Renato Beserra;
Cugler, Daniel Cintra;
Malaverri, Joana Gonzales E.;
Medeiros, Claudia Bauzer
A Provenance-Based Approach to Manage Long Term Preservation of Scientific Data (conference)
2014 IEEE 30th International Conference on Data Engineering Workshops (ICDEW),
978-1-4799-3481-2,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Data quality
)
@conference{SousaMedeiros2014,
abstract = {Long term preservation of scientific data goes beyond the data, and extends to metadata preservation and curation. While several researchers emphasize curation processes, our work is geared towards assessing the quality of scientific (meta)data. The rationale behind this strategy is that scientific data are often accessible via metadata - and thus ensuring metadata quality is a means to provide long term accessibility. This paper discusses our quality assessment architecture, presenting a case study on animal sound recording metadata. Our case study is an example of the importance of periodically assessing (meta)data quality, since knowledge about the world may evolve, and quality decrease with time, hampering long term preservation.},
author = {Renato Beserra Sousa and Daniel Cintra Cugler and Joana Gonzales E. Malaverri and Claudia Bauzer Medeiros},
booktitle = {2014 IEEE 30th International Conference on Data Engineering Workshops (ICDEW)},
date = {2014-03-06},
isbn = {978-1-4799-3481-2},
keyword = {Data quality},
link = {http://ieeexplore.ieee.org/document/6818316/},
title = {A Provenance-Based Approach to Manage Long Term Preservation of Scientific Data},
year = {2014}
}
Long term preservation of scientific data goes beyond the data, and extends to metadata preservation and curation. While several researchers emphasize curation processes, our work is geared towards assessing the quality of scientific (meta)data. The rationale behind this strategy is that scientific data are often accessible via metadata - and thus ensuring metadata quality is a means to provide long term accessibility. This paper discusses our quality assessment architecture, presenting a case study on animal sound recording metadata. Our case study is an example of the importance of periodically assessing (meta)data quality, since knowledge about the world may evolve, and quality decrease with time, hampering long term preservation.
|
Miranda, Eduardo;
Santanchè, André
Linked biology technical aspects - linking phenotypes and phylogenetic trees. (Technical Report)
Institute of Computing, State University of Campinas,
Technical Report,
IC-14-06,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Techreport
)
@techreport{Miranda2014,
abstract = {A large number of studies in biology, including those involving phylogenetic tree reconstruction, result in the production of a huge amount of data - e.g., phenotype descriptions, morphological data matrices, etc. Biologists increasingly face a challenge and opportunity of effectively discovering useful knowledge by crossing and comparing several pieces of information, not always linked and integrated. Ontologies are one of the promising choices to address this challenge. However, the existing digital phenotypic descriptions are stored in semi-structured formats, making extensive use of natural language. This technical report is related to research developed by us [] to address this problem, adding an intermediate step between semi-structured phenotypic descriptions and ontologies. It remodels semi-structured descriptions to a graph abstraction in which the data are linked. Graph transformations subsidize the transition from semi-structured data representation to a more formalized representation with ontologies. The present technical report drills down implementation details of our system. It provides a module to ingest phylogenetic trees and phenotype descriptions - represented in semi-structured formats - into a graph database. Additionally, two approaches to combine distinct data sources are presented, as well as an algorithm to trace changes in phylogenetic traits of trees.},
author = {Eduardo Miranda and André Santanchè},
date = {2014-02-01},
institution = {Institute of Computing, State University of Campinas},
keyword = {Techreport},
link = {http://www.ic.unicamp.br/~reltech/2014/14-06.pdf},
number = {IC-14-06},
pages = {56},
title = {Linked biology technical aspects - linking phenotypes and phylogenetic trees.},
type = {Technical Report},
year = {2014}
}
A large number of studies in biology, including those involving phylogenetic tree reconstruction, result in the production of a huge amount of data - e.g., phenotype descriptions, morphological data matrices, etc. Biologists increasingly face a challenge and opportunity of effectively discovering useful knowledge by crossing and comparing several pieces of information, not always linked and integrated. Ontologies are one of the promising choices to address this challenge. However, the existing digital phenotypic descriptions are stored in semi-structured formats, making extensive use of natural language. This technical report is related to research developed by us [] to address this problem, adding an intermediate step between semi-structured phenotypic descriptions and ontologies. It remodels semi-structured descriptions to a graph abstraction in which the data are linked. Graph transformations subsidize the transition from semi-structured data representation to a more formalized representation with ontologies. The present technical report drills down implementation details of our system. It provides a module to ingest phylogenetic trees and phenotype descriptions - represented in semi-structured formats - into a graph database. Additionally, two approaches to combine distinct data sources are presented, as well as an algorithm to trace changes in phylogenetic traits of trees.
|
Malaverri, Joana E. Gonzales;
Santanchè, André;
Medeiros, Claudia Bauzer
A provenance-based approach to evaluate data quality in eScience (article)
Inderscience,
International Journal of Metadata, Semantics and Ontologies,
1,
2014.
(
Abstract |
Links |
BibTeX |
Tags:
Data quality
)
@article{Malaverri2014,
abstract = {Data quality is growing in relevance as a research topic. Quality assessment has been progressively incorporated in many business environments, and in software engineering practices. eScience environments, however, because of the multiplicity and heterogeneity of data sources and scientific experts involved in a given problem, complicate data quality assessment. This paper deals with the evaluation of the quality of data managed by eScience applications. Our approach is based on data provenance, i.e. the history of the origins and transformations applied to a given data product. Our contributions include (a) the specification of a framework to track data provenance and use it to derive quality information, (b) a model for data provenance based on the Open Provenance Model, and (c) a methodology to evaluate the quality of data based on its provenance. Our proposal is validated experimentally by a prototype that takes advantage of the Taverna workflow system.},
author = {Joana E. Gonzales Malaverri and André Santanchè and Claudia Bauzer Medeiros},
date = {2014-02-01},
journal = {International Journal of Metadata, Semantics and Ontologies},
keyword = {Data quality},
link = {http://dl.acm.org/citation.cfm?id=2579580},
number = {1},
pages = {15-18},
publisher = {Inderscience},
title = {A provenance-based approach to evaluate data quality in eScience},
volume = {9},
year = {2014}
}
Data quality is growing in relevance as a research topic. Quality assessment has been progressively incorporated in many business environments, and in software engineering practices. eScience environments, however, because of the multiplicity and heterogeneity of data sources and scientific experts involved in a given problem, complicate data quality assessment. This paper deals with the evaluation of the quality of data managed by eScience applications. Our approach is based on data provenance, i.e. the history of the origins and transformations applied to a given data product. Our contributions include (a) the specification of a framework to track data provenance and use it to derive quality information, (b) a model for data provenance based on the Open Provenance Model, and (c) a methodology to evaluate the quality of data based on its provenance. Our proposal is validated experimentally by a prototype that takes advantage of the Taverna workflow system.
|
2013 |
Miranda, Eduardo de Paula
Linked biology — from phenotypes towards phylogenetic trees (mastersthesis)
Instituto de Computação - Unicamp,
mastersthesis,
2013.
(
Abstract |
Links |
BibTeX |
Tags:
Mastersthesis
)
@mastersthesis{Miranda2013,
abstract = {A large number of studies in biology, including those involving phylogenetic tree reconstruction, result in the production of a huge amount of data -- e.g., phenotype descriptions, morphological data matrices, phylogenetic trees, etc. Biologists increasingly face a challenge and opportunity of effectively discovering useful knowledge by crossing and comparing several pieces of information, not always linked and integrated. In this work, we are interested in a specific biology context, in which biologists apply computational tools to build and share digital descriptions of living beings. We propose a process that departs from fragmentary data sources, which we map to graphs, towards a full integration of descriptions through ontologies. Graph databases mediate this evolution process. They are less schema dependent and, since an ontology is also a graph, the mapping process from the initial graph towards an ontology becomes a sequence of graph transformations. Our motivation stems from the idea that transforming phenotypical descriptions into a network of relationships and looking for links among related elements will enhance the ability to solve more complex problems supported by machines. This work details the design principles behind our process and two practical implementations as proof of concept.},
author = {Eduardo de Paula Miranda},
date = {2013-11-22},
keyword = {Mastersthesis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/04/Eduardo-Miranda-M.-Sc.-Dissertation.pdf},
school = {Instituto de Computação - Unicamp},
title = {Linked biology — from phenotypes towards phylogenetic trees},
year = {2013}
}
A large number of studies in biology, including those involving phylogenetic tree reconstruction, result in the production of a huge amount of data -- e.g., phenotype descriptions, morphological data matrices, phylogenetic trees, etc. Biologists increasingly face a challenge and opportunity of effectively discovering useful knowledge by crossing and comparing several pieces of information, not always linked and integrated. In this work, we are interested in a specific biology context, in which biologists apply computational tools to build and share digital descriptions of living beings. We propose a process that departs from fragmentary data sources, which we map to graphs, towards a full integration of descriptions through ontologies. Graph databases mediate this evolution process. They are less schema dependent and, since an ontology is also a graph, the mapping process from the initial graph towards an ontology becomes a sequence of graph transformations. Our motivation stems from the idea that transforming phenotypical descriptions into a network of relationships and looking for links among related elements will enhance the ability to solve more complex problems supported by machines. This work details the design principles behind our process and two practical implementations as proof of concept.
|
Miranda, Eduardo;
Grand, Anaïs;
Lebbe, Régine Vignes;
Santanchè, André
Coupling phenotype descriptions and phylogenetic trees: from SDD to ontologies via graph databases (conference)
TDWG 2013 Annual Conference,
Florença, Italy,
2013.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Miranda2013b,
abstract = {Characters are at the heart of the taxonomist’s tasks: discovering, describing, naming, comparing, characterizing new taxa, classifying them according to their phylogenetic relationships and studying their history, diversity and distribution. Taxonomic works result in the production of a huge amount of data (e.g., phenotype descriptions, morphological data matrices, etc.) stated in free-text format and digitally represented in many semi-structured standards, not often able to be interconnected. However, a semantic framework is needed for the integration of characters across studies, wherein ontologies are one of the promising choices to address this challenge. We face two challenges in this context: (i) how to relate several unconnected ontologies to be used in ontology-based descriptions; and (ii) how to map/reuse the huge amount of existing resources developed before ontologies. To address (i), we present a semantic representation of characters, with a unifying meta-model that can be superimposed over existing bio-ontologies, disciplining their relations and favoring their integration. Given the fact that converting taxonomic data into ontologies is not a straightforward task, to address (ii) we are implementing an intermediate step between semi-structured phenotypic descriptions and ontologies, based on graph databases. In the Semantic Web context, an ontology in RDF (Resource Description Framework)/OWL (Web Ontology Language) is essentially a graph where the nodes and relations are objects and properties following some class model. Texts and labels in natural language will appear as complementary documentation for human consumption. We mapped the SDD (Structured Descriptive Data) format to the graph model, remodeling semi-structured descriptions to a graph abstraction, in which the data are linked, enabling the coupling of phylogenetic trees and phenotype descriptions.
Graph databases are less schema dependent and, since an ontology is also a graph, the mapping from the original graph towards an ontology becomes a sequence of graph transformations. This graph model was designed to be published on the Web in a Linked Data approach. Practical experiments are illustrated with the study of fossil ferns, using the programs Xper2 (for descriptions), which is compatible with the SDD standards, and LisBeth (for phylogenetics).},
address = {Florença, Italy},
author = {Eduardo Miranda and Anaïs Grand and Régine Vignes Lebbe and André Santanchè},
date = {2013-11-01},
keyword = {Conference},
link = {https://mbgserv18.mobot.org/ocs/index.php/tdwg/2013/paper/view/404},
publisher = {TDWG 2013 Annual Conference},
title = {Coupling phenotype descriptions and phylogenetic trees: from SDD to ontologies via graph databases},
year = {2013}
}
Characters are at the heart of the taxonomist’s tasks: discovering, describing, naming, comparing, characterizing new taxa, classifying them according to their phylogenetic relationships and studying their history, diversity and distribution. Taxonomic works result in the production of a huge amount of data (e.g., phenotype descriptions, morphological data matrices, etc.) stated in free-text format and digitally represented in many semi-structured standards, not often able to be interconnected. However, a semantic framework is needed for the integration of characters across studies, wherein ontologies are one of the promising choices to address this challenge. We face two challenges in this context: (i) how to relate several unconnected ontologies to be used in ontology-based descriptions; and (ii) how to map/reuse the huge amount of existing resources developed before ontologies. To address (i), we present a semantic representation of characters, with a unifying meta-model that can be superimposed over existing bio-ontologies, disciplining their relations and favoring their integration. Given the fact that converting taxonomic data into ontologies is not a straightforward task, to address (ii) we are implementing an intermediate step between semi-structured phenotypic descriptions and ontologies, based on graph databases. In the Semantic Web context, an ontology in RDF (Resource Description Framework)/OWL (Web Ontology Language) is essentially a graph where the nodes and relations are objects and properties following some class model. Texts and labels in natural language will appear as complementary documentation for human consumption. We mapped the SDD (Structured Descriptive Data) format to the graph model, remodeling semi-structured descriptions to a graph abstraction, in which the data are linked, enabling the coupling of phylogenetic trees and phenotype descriptions.
Graph databases are less schema dependent and, since an ontology is also a graph, the mapping from the original graph towards an ontology becomes a sequence of graph transformations. This graph model was designed to be published on the Web in a Linked Data approach. Practical experiments are illustrated with the study of fossil ferns, using the programs Xper2 (for descriptions), which is compatible with the SDD standards, and LisBeth (for phylogenetics).
|
Cugler, Daniel Cintra;
Medeiros, Claudia Bauzer;
Shekhar, Shashi;
Toledo, Luís Felipe
A Geographical Approach for Metadata Quality Improvement in Biological Observation Databases (conference)
9th IEEE International Conference on e-Science,
2013.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Cugler2013,
abstract = {This paper addresses the problem of improving the quality of metadata in biological observation databases, in particular those associated with observations of living beings, and which are often used as a starting point for biodiversity analyses. Poor quality metadata lead to incorrect scientific conclusions, and can mislead experts in their analyses. Thus, it is important to design and develop methods to detect and correct metadata quality problems. This is a challenging problem because of the variety of issues concerning such metadata, e.g., misnaming of species, location uncertainty and imprecision concerning where observations were recorded. Related work is limited because it does not adequately model such issues. We propose a geographic approach based on expert-led classification of place and/or range mismatch anomalies detected by our algorithms. Our work is tested using a case study with the Fonoteca Neotropical Jacques Vielliard, one of the 10 largest animal sound collections in the world.},
author = {Daniel Cintra Cugler and Claudia Bauzer Medeiros and Shashi Shekhar and Luís Felipe Toledo},
booktitle = {9th IEEE International Conference on e-Science},
date = {2013-10-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/escience.pdf},
title = {A Geographical Approach for Metadata Quality Improvement in Biological Observation Databases},
year = {2013}
}
This paper addresses the problem of improving the quality of metadata in biological observation databases, in particular those associated with observations of living beings, and which are often used as a starting point for biodiversity analyses. Poor quality metadata lead to incorrect scientific conclusions, and can mislead experts in their analyses. Thus, it is important to design and develop methods to detect and correct metadata quality problems. This is a challenging problem because of the variety of issues concerning such metadata, e.g., misnaming of species, location uncertainty and imprecision concerning where observations were recorded. Related work is limited because it does not adequately model such issues. We propose a geographic approach based on expert-led classification of place and/or range mismatch anomalies detected by our algorithms. Our work is tested using a case study with the Fonoteca Neotropical Jacques Vielliard, one of the 10 largest animal sound collections in the world.
|
Miranda, Eduardo;
Santanchè, André
Unifying Phenotypes to Support Semantic Descriptions (conference)
6th Brazilian Conference on Ontological Research,
Belo Horizonte, Brazil,
2013.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Miranda2013b,
abstract = {In life sciences, there are several biological datasets shared through the web. All this abundance of data carries a great opportunity to explore complex relationships among the diversity of species. However, their physical format varies from independent data files to databases, which are heterogeneous in model and representation, hampering their integration. Ontologies are one of the promising choices to address this challenge. However, the existing digital phenotypic descriptions are stored in semi-structured formats, making extensive use of natural language. If, on one hand, this patrimony is highly relevant, on the other hand, converting it into ontologies is not a straightforward task. The present article addresses this problem by adding an intermediate step between semi-structured phenotypic descriptions and ontologies. It remodels semi-structured descriptions to a graph abstraction in which the data are linked. Graph transformations subsidize the transition from semi-structured data representation to a more formalized representation through ontologies.},
address = {Belo Horizonte, Brazil},
author = {Eduardo Miranda and André Santanchè},
date = {2013-09-22},
keyword = {Conference},
link = {http://ceur-ws.org/Vol-1041/ontobras-2013_paper50.pdf},
pages = {12},
publisher = {6th Brazilian Conference on Ontological Research},
title = {Unifying Phenotypes to Support Semantic Descriptions},
year = {2013}
}
In life sciences, there are several biological datasets shared through the web. All this abundance of data carries a great opportunity to explore complex relationships among the diversity of species. However, their physical format varies from independent data files to databases, which are heterogeneous in model and representation, hampering their integration. Ontologies are one of the promising choices to address this challenge. However, the existing digital phenotypic descriptions are stored in semi-structured formats, making extensive use of natural language. If, on one hand, this patrimony is highly relevant, on the other hand, converting it into ontologies is not a straightforward task. The present article addresses this problem by adding an intermediate step between semi-structured phenotypic descriptions and ontologies. It remodels semi-structured descriptions to a graph abstraction in which the data are linked. Graph transformations subsidize the transition from semi-structured data representation to a more formalized representation through ontologies.
|
Koga, Ivo Kenji
An Event-Based Approach to Process Environmental Data (phdthesis)
Instituto de Computação - Unicamp,
phdthesis,
2013.
(
Links |
BibTeX |
Tags:
PhDThesis
)
@phdthesis{Koga2013,
author = {Ivo Kenji Koga},
date = {2013-09-01},
keyword = {PhDThesis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/2013-10-13-TESE-IvoKoga-v2.4.pdf},
school = {Instituto de Computação - Unicamp},
title = {An Event-Based Approach to Process Environmental Data},
year = {2013}
}
|
Jensen, R.;
Cruz, M.;
Gomes-Jr, L.;
Lopes, M.
Attributing fuzzy values to nursing diagnoses and their elements: the specialists' opinion. (article)
International Journal of Nursing Knowledge,
2013.
(
Links |
BibTeX |
Tags:
Journal Paper
)
@article{Jensen2013,
author = {R. Jensen and M. Cruz and L. Gomes-Jr and M. Lopes},
date = {2013-07-01},
journal = {International Journal of Nursing Knowledge},
keyword = {Journal Paper},
link = {http://onlinelibrary.wiley.com/doi/10.1111/j.2047-3095.2013.01242.x/abstract;jsessionid=07EC6F2CE5BB23E7FD42179189538132.d02t04},
title = {Attributing fuzzy values to nursing diagnoses and their elements: the specialists' opinion.},
year = {2013}
}
|
Gomes-Jr, L.;
Jensen, R.;
Santanche, A.
Query-based inferences in the Complex Data Management System (conference)
SLG/ICML 2013,
2013.
(
Links |
BibTeX |
Tags:
Conference
)
@conference{Gomes-Jr2013b,
author = {L. Gomes-Jr and R. Jensen and A. Santanche},
booktitle = {SLG/ICML 2013},
date = {2013-07-01},
keyword = {Conference},
link = {http://www.ic.unicamp.br/~ra041475/docs/gomes-jr_et_al-slg-2013.pdf},
title = {Query-based inferences in the Complex Data Management System},
year = {2013}
}
|
Gomes, Alessandra da Silva
Web metalaboratory (mastersthesis)
Instituto de Computação - Universidade Estadual de Campinas (UNICAMP),
Campinas - SP,
mastersthesis,
2013.
(
Abstract |
Links |
BibTeX |
Tags:
Distance education, Educational technology, Environmental laboratories, Web laboratories
)
@mastersthesis{Gomes2013,
abstract = {The amount of scientific data, services and on-line tools available on the Web offers an unprecedented opportunity to conceive new kinds of laboratories blending resources. Existing experimental and collected data can substantiate asynchronous laboratories. Combined with mashup-enabled software, it is possible to produce hybrid laboratories to confront, for example, synthetic simulations with observations. This work addresses this opportunity in the Education context through our metalaboratory, an authoring environment to produce laboratories by combining building blocks encapsulated in components. We introduce here the lab composition patterns and the active Web templates as fundamental mechanisms to support a lab authoring task. These laboratories can be embedded and mashed-up in Web documents. This work shows practical experiments of producing Web virtual and hybrid laboratories.},
address = {Campinas - SP},
author = {Alessandra da Silva Gomes},
date = {2013-06-28},
keyword = {Distance education, Educational technology, Environmental laboratories, Web laboratories},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/09/GomesAlessandradaSilva_M.pdf},
school = {Instituto de Computação - Universidade Estadual de Campinas (UNICAMP)},
title = {Web metalaboratory},
year = {2013}
}
The amount of scientific data, services and on-line tools available on the Web offers an unprecedented opportunity to conceive new kinds of laboratories blending resources. Existing experimental and collected data can substantiate asynchronous laboratories. Combined with mashup-enabled software, it is possible to produce hybrid laboratories to confront, for example, synthetic simulations with observations. This work addresses this opportunity in the Education context through our metalaboratory, an authoring environment to produce laboratories by combining building blocks encapsulated in components. We introduce here the lab composition patterns and the active Web templates as fundamental mechanisms to support a lab authoring task. These laboratories can be embedded and mashed-up in Web documents. This work shows practical experiments of producing Web virtual and hybrid laboratories.
|
Silva, Felipe Henriques da
Serial Annotator : managing annotations of time series (mastersthesis)
Universidade Estadual de Campinas,
mastersthesis,
2013.
(
Abstract |
Links |
BibTeX |
Tags:
time series
)
@mastersthesis{felipesilva,
abstract = {Time series are sequences of values measured at successive time instants. They are used in several domains such as agriculture, medicine and economics. The analysis of these series is of utmost importance, providing experts the ability to identify trends and forecast possible scenarios. In order to facilitate their analyses, experts often associate annotations with time series. Such annotations can also be used to correlate distinct series, or look for specific series in a database. There are many challenges involved in managing annotations - from finding proper structures to associate them with series, to organizing and retrieving series based on annotations. This work contributes to the management of time series. Its main contributions are the design and development of a framework for the management of multiple annotations associated with one or multiple time series in a database. The framework also provides means for annotation versioning, so that previous states of an annotation are never lost. Serial Annotator is an application implemented for the Android smartphone platform. It has been used to validate the proposed framework and has been tested with real data involving agriculture problems.},
author = {Felipe Henriques da Silva},
date = {2013-06-10},
keyword = {time series},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/08/SilvaFelipeHenriquesda_M.pdf},
school = {Universidade Estadual de Campinas},
title = {Serial Annotator : managing annotations of time series},
year = {2013}
}
Time series are sequences of values measured at successive time instants. They are used in several domains such as agriculture, medicine and economics. The analysis of these series is of utmost importance, providing experts the ability to identify trends and forecast possible scenarios. In order to facilitate their analyses, experts often associate annotations with time series. Such annotations can also be used to correlate distinct series, or look for specific series in a database. There are many challenges involved in managing annotations - from finding proper structures to associate them with series, to organizing and retrieving series based on annotations. This work contributes to the management of time series. Its main contributions are the design and development of a framework for the management of multiple annotations associated with one or multiple time series in a database. The framework also provides means for annotation versioning, so that previous states of an annotation are never lost. Serial Annotator is an application implemented for the Android smartphone platform. It has been used to validate the proposed framework and has been tested with real data involving agriculture problems.
|
Malaverri, Joana Esther Gonzales
Supporting data quality assessment in eScience: a provenance based approach (phdthesis)
Instituto de Computação - Unicamp,
phdthesis,
2013.
(
Abstract |
Links |
BibTeX |
Tags:
PhDThesis
)
@phdthesis{Malaverri2013,
abstract = {Data quality is a recurrent concern in all scientific domains. Experiments analyze and manipulate several kinds of datasets, and generate data to be (re)used by other experiments. The basis for obtaining good scientific results is highly associated with the degree of quality of such datasets. However, data involved with the experiments are manipulated by a wide range of users, with distinct research interests, using their own vocabularies, work methodologies, models, and sampling needs. Given this scenario, a challenge in computer science is to come up with solutions that help scientists to assess the quality of their data. Different efforts have been proposed addressing the estimation of quality. Some of these efforts outline that data provenance attributes should be used to evaluate quality. However, most of these initiatives address the evaluation of a specific quality attribute, frequently focusing on atomic data values, thereby reducing the applicability of these approaches. Taking this scenario into account, there is a need for new solutions that scientists can adopt to assess how good their data are. In this PhD research, we present an approach to attack this problem based on the notion of data provenance. Unlike other similar approaches, our proposal combines quality attributes specified within a context by specialists and metadata on the provenance of a data set. The main contributions of this work are: (i) the specification of a framework that takes advantage of data provenance to derive quality information; (ii) a methodology associated with this framework that outlines the procedures to support the assessment of quality; (iii) the proposal of two different provenance models to capture provenance information, for fixed and extensible scenarios; and (iv) validation of items (i) through (iii), with their discussion via case studies in agriculture and biodiversity.},
author = {Joana Esther Gonzales Malaverri},
date = {2013-05-01},
keyword = {PhDThesis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/thesisJoana.pdf},
school = {Instituto de Computação - Unicamp},
title = {Supporting data quality assessment in eScience: a provenance based approach},
year = {2013}
}
Data quality is a recurrent concern in all scientific domains. Experiments analyze and manipulate several kinds of datasets, and generate data to be (re)used by other experiments. The basis for obtaining good scientific results is highly associated with the degree of quality of such datasets. However, data involved with the experiments are manipulated by a wide range of users, with distinct research interests, using their own vocabularies, work methodologies, models, and sampling needs. Given this scenario, a challenge in computer science is to come up with solutions that help scientists to assess the quality of their data. Different efforts have been proposed addressing the estimation of quality. Some of these efforts outline that data provenance attributes should be used to evaluate quality. However, most of these initiatives address the evaluation of a specific quality attribute, frequently focusing on atomic data values, thereby reducing the applicability of these approaches. Taking this scenario into account, there is a need for new solutions that scientists can adopt to assess how good their data are. In this PhD research, we present an approach to attack this problem based on the notion of data provenance. Unlike other similar approaches, our proposal combines quality attributes specified within a context by specialists and metadata on the provenance of a data set. The main contributions of this work are: (i) the specification of a framework that takes advantage of data provenance to derive quality information; (ii) a methodology associated with this framework that outlines the procedures to support the assessment of quality; (iii) the proposal of two different provenance models to capture provenance information, for fixed and extensible scenarios; and (iv) validation of items (i) through (iii), with their discussion via case studies in agriculture and biodiversity.
|
Longo, João Sávio Ceregatti
Management of integrity constraints for multi-scale geospatial data (mastersthesis)
Instituto de Computação - Unicamp,
mastersthesis,
2013.
(
Links |
BibTeX |
Tags:
Mastersthesis
)
@mastersthesis{Longo2013,
author = {João Sávio Ceregatti Longo},
date = {2013-03-01},
keyword = {Mastersthesis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/master_thesis_final.pdf},
school = {Instituto de Computação - Unicamp},
title = {Management of integrity constraints for multi-scale geospatial data},
year = {2013}
}
|
Gomes-Jr, Luiz;
Jensen, Rodrigo;
Santanchè, André
Towards query model integration: topology-aware, IR-inspired metrics for declarative graph querying (conference)
Second International Workshop on Querying Graph Structured Data,
2013.
(
Abstract |
BibTeX |
Tags:
Conference
)
@conference{Gomes-Jr2013,
abstract = {Accompanying the growth of the internet and the consequent diversification of applications and data processing needs, there has been a rapid proliferation of data and query models. While graph models such as RDF have been successfully used to integrate data from diverse origins, interaction with the integrated data is still limited by inflexible query models that cannot express concepts from multiple paradigms. In this paper we analyze data and query models typical of modern data-driven applications. We then propose an integrated query model aimed at covering a broad range of applications, allowing expressive queries that capture elements from diverse data models and querying paradigms. We employ graph models to integrate data from structured and unstructured sources. We also reinterpret as graph analysis tasks several ranking metrics typical of information retrieval (IR) systems. The metrics allow flexible correlation of data elements based on topological properties of the underlying graph. The new query model is materialized in a query language named in* (in star). We present experiments with real data that demonstrate the expressiveness and practicability of our approach.},
author = {Luiz Gomes-Jr and Rodrigo Jensen and André Santanchè},
booktitle = {Second International Workshop on Querying Graph Structured Data},
date = {2013-03-01},
keyword = {Conference},
title = {Towards query model integration: topology-aware, IR-inspired metrics for declarative graph querying},
year = {2013}
}
Accompanying the growth of the internet and the consequent diversification of applications and data processing needs, there has been a rapid proliferation of data and query models. While graph models such as RDF have been successfully used to integrate data from diverse origins, interaction with the integrated data is still limited by inflexible query models that cannot express concepts from multiple paradigms. In this paper we analyze data and query models typical of modern data-driven applications. We then propose an integrated query model aimed at covering a broad range of applications, allowing expressive queries that capture elements from diverse data models and querying paradigms. We employ graph models to integrate data from structured and unstructured sources. We also reinterpret as graph analysis tasks several ranking metrics typical of information retrieval (IR) systems. The metrics allow flexible correlation of data elements based on topological properties of the underlying graph. The new query model is materialized in a query language named in* (in star). We present experiments with real data that demonstrate the expressiveness and practicability of our approach.
|
Vilar, Bruno S. C. M.;
Medeiros, Claudia Bauzer;
Santanchè, André
Towards Adapting Scientific Workflow Systems to Healthcare Planning (conference)
HEALTHINF - International Conference on Health Informatics,
2013.
(
Links |
BibTeX |
Tags:
Conference
)
@conference{Vilar2013,
author = {Bruno S. C. M. Vilar and Claudia Bauzer Medeiros and André Santanchè},
booktitle = {HEALTHINF - International Conference on Health Informatics},
date = {2013-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/HealthInf-Biostec-2013.pdf},
title = {Towards Adapting Scientific Workflow Systems to Healthcare Planning},
year = {2013}
}
|
Malaverri, J. E. G.;
Mota, M. S.;
Medeiros, C. B.
Estimating the quality of data using provenance: a case study in eScience (conference)
19th Americas Conference on Information Systems (AMCIS),
2013.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Malaverri2013c,
abstract = {Data quality assessment is a key factor in data-intensive domains. The data deluge is aggravated by an increasing need for interoperability and cooperation across groups and organizations. New alternatives must be found to select the data that best satisfy users’ needs in a given context. This paper presents a strategy to provide information to support the evaluation of the quality of data sets. This strategy is based on combining metadata on the provenance of a data set (derived from workflows that generate it) and quality dimensions defined by the set’s users, based on the desired context of use. Our solution, validated via a case study, takes advantage of a semantic model to preserve data provenance related to applications in a specific domain.},
author = {J. E. G. Malaverri and M. S. Mota and C. B. Medeiros},
booktitle = {19th Americas Conference on Information Systems (AMCIS)},
date = {2013-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/AMCIS2013_Paper_ProvToQual.pdf},
title = {Estimating the quality of data using provenance: a case study in eScience},
year = {2013}
}
|
Malaverri, Joana E. Gonzales;
Santanchè, André;
Medeiros, Claudia Bauzer
A Provenance-based Approach to Evaluate Data Quality in eScience (article)
Int. J. Metadata, Semantics and Ontologies,
2013.
(
Abstract |
BibTeX |
Tags:
Article
)
@article{Malaverri2013,
abstract = {Data quality is growing in relevance as a research topic. This is becoming increasingly crucial in data-intensive domains, e.g., stock market and financial studies, eHealth, or environmental research. Indeed, the data deluge characteristic of eScience applications has brought about new concerns along this direction. Quality assessment methods and models have been progressively incorporated in many business environments, as well as in software engineering practices. eScience environments, however, because of the many data source providers, kinds of scientific expertise needed, and multiple time-and-space scales involved in a given problem make it difficult to assess data quality. This paper is concerned with the evaluation of the quality of data managed by eScience applications. Our approach is based on data provenance, i.e. the history of the origins and transformation processes applied to a given data product. Our contributions include: (i) the specification of a framework to track data provenance and use this information to derive quality information; (ii) a model for data provenance based on the Open Provenance Model; and (iii) a methodology to evaluate the quality of some digital artifact based on its provenance. Our proposal is validated experimentally by a prototype we developed that takes advantage of the Taverna workflow system.},
author = {Joana E. Gonzales Malaverri and André Santanchè and Claudia Bauzer Medeiros},
date = {2013-01-01},
journal = {Int. J. Metadata, Semantics and Ontologies},
keyword = {Article},
title = {A Provenance-based Approach to Evaluate Data Quality in eScience},
year = {2013}
}
|
Longo, João Sávio Ceregatti;
Medeiros, Claudia Bauzer
Providing multi-scale consistency for multi-scale geospatial data (conference)
25th International Conference on Scientific and Statistical Database Management (SSDBM),
2013.
(
BibTeX |
Tags:
Conference
)
@conference{Longo2013b,
author = {João Sávio Ceregatti Longo and Claudia Bauzer Medeiros},
booktitle = {25th International Conference on Scientific and Statistical Database Management (SSDBM)},
date = {2013-01-01},
keyword = {Conference},
note = {Accepted},
pages = {12},
title = {Providing multi-scale consistency for multi-scale geospatial data},
year = {2013}
}
|
Bernardo, I. R.;
Mota, M. S.;
Santanche, A.
Extracting and Semantically Integrating Implicit Schemas from Multiple Spreadsheets of Biology based on the Recognition of their Nature (article)
Journal of Information and Data Management - JIDM,
2,
2013.
(
Abstract |
Links |
BibTeX |
Tags:
Article
)
@article{BERNARDO2013,
abstract = {Spreadsheets are popular among users and organizations, becoming an essential data management tool. The easiness to handle spreadsheets associated with the creative freedom resulted in an increase in the volume of data available in this format. However, spreadsheets are not conceived to integrate data from distinct sources and challenges arise involving systematization of processes to reuse and combine their data. Many related initiatives address the problem of integrating data inside spreadsheets, focusing on lexical and syntactical aspects. However, the proper exploitation of the semantics related to this data is still an opportunity. In this sense, some related work propose mapping spreadsheets contents to open interoperability standards, mainly Semantic Web standards. The main limitation of such proposals is the assumption that it is possible to recognize and make explicit the schema and the semantics of spreadsheets automatically, regardless of their domain. This work differs from related work by assuming the essential role of the context – mainly the domain in which the spreadsheet was conceived – to delineate shared practices of the biology community, which establishes building patterns to be automatically recognized by our system, in a data extraction process and schema recognition. In this article, we present the result of a practical experiment involving such a system, in which we integrate hundreds of spreadsheets belonging to the biology domain and available on the Web. This integration was possible due to observation that the recognition of a spreadsheet nature can be achieved from its tabular organization.},
author = {BERNARDO, I. R. and MOTA, M. S. and SANTANCHE, A.},
date = {2013-01-01},
journal = {Journal of Information and Data Management - JIDM},
keyword = {Article},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/220-1058-1-PB.pdf},
number = {2},
pages = {104-114},
title = {Extracting and Semantically Integrating Implicit Schemas from Multiple Spreadsheets of Biology based on the Recognition of their Nature},
volume = {4},
year = {2013}
}
|
2012 |
Senra, Rodrigo Dias de Arruda
Organization is sharing : from eScience to personal information management (phdthesis)
Institute of Computing, UNICAMP,
phdthesis,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Personal Information Management
)
@phdthesis{senra2012,
abstract = {Information sharing has always been a key issue in any kind of joint effort. Paradoxically, with the data deluge, the more information available, the harder it is to design and implement solutions that effectively foster such sharing. This thesis analyzes distinct aspects of sharing - from eScience-related environments to personal information. As a result of this analysis, it provides answers to some of the problems encountered, along three axes. The first, SciFrame, is a specific framework that describes systems or processes involving scientific digital data manipulation, serving as a descriptive pattern to help system comparison. The adoption of SciFrame to describe distinct scientific virtual environments allows identifying commonalities and points for interoperation. The second axis addresses the specific problem of communication between arbitrary systems and services provided by distinct database platforms, via the use of the so-called database descriptors or DBDs. These descriptors provide independence between applications and the services, thereby enhancing sharing across applications and databases. The third contribution, Organographs, provides means to deal with multifaceted information organization. It addresses problems of sharing personal information by exploiting the way we organize such information. Here, rather than trying to provide means to share the information itself, the unit of sharing is the organization of the information. By designing and sharing organographs, distinct groups provide each other dynamic, reconfigurable views of how information is organized, thereby promoting interoperability and reuse. Organographs are an innovative approach to hierarchical data management. These three contributions are centered on the basic idea of building and sharing hierarchical organizations.
Part of these contributions was validated by case studies and, in the case of organographs, an actual implementation.},
author = {Rodrigo Dias de Arruda Senra},
date = {2012-12-10},
keyword = {Personal Information Management},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/04/SenraRodrigoDiasArruda_D3.pdf},
note = {Supervisor Claudia Bauzer Medeiros},
school = {Institute of Computing, UNICAMP},
title = {Organization is sharing : from eScience to personal information management},
year = {2012}
}
|
Santanchè, André;
Medeiros, Claudia Bauzer;
Jomier, Genevieve;
Zam, Michel
Challenges of the Anthropocene epoch - supporting multi-focus research (conference)
Proceedings of the XIII Brazilian Symposium on Geoinformatics - GeoInfo,
2012.
(
Links |
BibTeX |
Tags:
Conference
)
@conference{Santanche2012,
author = {André Santanchè and Claudia Bauzer Medeiros and Genevieve Jomier and Michel Zam},
booktitle = {Proceedings of the XIII Brazilian Symposium on Geoinformatics - GeoInfo},
date = {2012-11-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/santancheetal2012-v02.pdf},
pages = {1-10},
title = {Challenges of the Anthropocene epoch - supporting multi-focus research},
year = {2012}
}
|
Malaverri, Joana E. G.;
Medeiros, Claudia B.
Data Quality in Agriculture Applications (conference)
XIII Brazilian Symposium on GeoInformatics - GeoInfo,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Malaverri2012,
abstract = {Data quality is a common concern in a wide range of domains. Since agriculture plays an important role in the Brazilian economy, it is crucial that the data be useful and with a proper level of quality for the decision making process, planning activities, among others. Nevertheless, this requirement is not often taken into account when different systems and databases are modeled. This work presents a review about data quality issues covering some efforts in agriculture and geospatial science to tackle these issues. The goal is to help researchers and practitioners to design better applications. In particular, we focus on the different dimensions of quality and the approaches that are used to measure them.},
author = {Joana E. G. Malaverri and Claudia B. Medeiros},
booktitle = {XIII Brazilian Symposium on GeoInformatics - GeoInfo},
date = {2012-11-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/geoinfoJoana2012.pdf},
title = {Data Quality in Agriculture Applications},
year = {2012}
}
|
Bernardo, Ivelize Rocha;
Mota, Matheus Silva;
Santanchè, André
Extraindo e Integrando Semanticamente Dados de Múltiplas Planilhas Eletrônicas a Partir do Reconhecimento de Sua Natureza (conference)
Simpósio Brasileiro de Banco de Dados (SBBD),
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Bernardo2012b,
abstract = {Spreadsheets are popular among users and organizations, becoming an essential data management tool. The ease of access, associated with the creative freedom offered by spreadsheets, resulted in the increase of the data volume available in this format. However, spreadsheets are not conceived for integration of data from distinct sources and challenges arise involving systematization of processes to reuse and combine their data. Many related initiatives address integration of data inside spreadsheets focusing on lexical and syntactical aspects; however, the exploration of the semantics related to these data is still an open challenge. In this sense, some related work propose mapping spreadsheet contents to open interoperability standards, mainly Semantic Web standards. The main limitation of such proposals is the assumption that it is possible to recognize and make explicit the schema and the semantics of spreadsheets automatically, regardless of their domain. This work differs from related work by assuming the essential role of the context – mainly the domain in which the spreadsheet was conceived – to delineate shared practices of the community, which establishes building standards to be automatically recognized by our system, in a data extraction process and schema recognition. In this paper we present the result of a practical experiment involving such a system, in which we integrated data from hundreds of spreadsheets available on the Web. This integration was possible due to a unique ability of our approach of recognizing the spreadsheet nature, analyzed inside its creation context.},
author = {Ivelize Rocha Bernardo and Matheus Silva Mota and André Santanchè},
booktitle = {Simpósio Brasileiro de Banco de Dados (SBBD)},
date = {2012-10-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/SBBD2012.pdf},
title = {Extraindo e Integrando Semanticamente Dados de Múltiplas Planilhas Eletrônicas a Partir do Reconhecimento de Sua Natureza},
year = {2012}
}
|
Bernardo, Ivelize Rocha
Planilhas eletrônicas , Web semântica , Recuperação da informação , Biologia - Processamento de dados (mastersthesis)
Instituto de Computação - Universidade Estadual de Campinas (UNICAMP),
Campinas - SP,
mastersthesis,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Biologia - Processamento de dados, Planilhas eletrônicas, Recuperação da informação, Web semântica
)
@mastersthesis{bernardo2012b,
abstract = {The flexibility provided by spreadsheets allows their customization according to the mental models of their authors, making them popular data management systems. The need to integrate and articulate data from different spreadsheets has been growing steadily and, for machines to assist in this process, the challenge is how to automatically interpret their implicit schemas, which are aimed at human interpretation. Some works propose mapping spreadsheet contents to open interoperability standards, mainly those of the Semantic Web. The main limitation of these works is the assumption that it is possible to recognize and make explicit the schemas and semantics of spreadsheets automatically, regardless of their domain. This work differs by considering the context and the domain in which the spreadsheets were conceived essential to delineate the set of practices shared by the community in question, which establishes construction patterns to be automatically recognized by our system, in a process of data extraction and schema explicitation. Our proposal involves a strategy to characterize construction patterns associated with the conceptual models of authors when building spreadsheets, which results from a broad survey of practices shared by spreadsheet authors in the Biology domain. In this document we present the result of a practical experiment involving such a system, in which we integrated data from hundreds of spreadsheets available on the Web. This integration was possible due to the unique ability of our approach to recognize the nature of the analyzed spreadsheet within its creation context.},
address = {Campinas - SP},
author = {Ivelize Rocha Bernardo},
date = {2012-09-04},
keyword = {Biologia - Processamento de dados, Planilhas eletrônicas, Recuperação da informação, Web semântica},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2015/09/BernardoIvelizeRocha_M.pdf},
school = {Instituto de Computação - Universidade Estadual de Campinas (UNICAMP)},
title = {Planilhas eletrônicas , Web semântica , Recuperação da informação , Biologia - Processamento de dados},
year = {2012}
}
|
Koga, Ivo;
Medeiros, Claudia Bauzer
Integrating and processing events from Heterogeneous Data Sources (conference)
Proceedings VI eScience Workshop - XXXII Brazilian Computer Society Conference,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Koga2012,
abstract = {Environmental monitoring studies present many challenges. A huge amount of data is provided in different formats from different sources (e.g. sensor networks and databases). This paper presents a framework we have developed to overcome some of these problems, based on combining aspects of Enterprise Service Bus (ESB) architectures and Event Processing mechanisms. First, we treat integration using ESB and then use event processing to transform, filter and detect event patterns, where all data arriving at a given point are treated uniformly as event streams. A case study concerning data streams of meteorological stations is provided to show the feasibility of this solution.},
author = {Ivo Koga and Claudia Bauzer Medeiros},
booktitle = {Proceedings VI eScience Workshop - XXXII Brazilian Computer Society Conference},
date = {2012-07-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/CSBC-Workshop-eScience-Ivo-2012-06-06.pdf},
title = {Integrating and processing events from Heterogeneous Data Sources},
year = {2012}
}
|
Cugler, Daniel Cintra;
Medeiros, Claudia Bauzer;
Toledo, Felipe
An architecture for retrieval of animal sound recordings based on context variables (article)
Concurrency and Computation - Practice and Experience,
2012.
(
Abstract |
BibTeX |
Tags:
Article
)
@article{Cugler2012,
abstract = {For decades, biologists around the world have recorded animal sounds. As the number of records grows, so does the difficulty to manage them, presenting challenges to save, retrieve, share and manage sounds. These challenges are complicated by the fact that animal sound recordings have specific peculiarities, associated to the context in which the sound was recorded. For example, sounds emitted by individuals that are in groups may be different from ones emitted by isolated individuals. Though these characteristics may be relevant to biologists, they are seldom explicit in the recording metadata. This paper discusses our ongoing research on management of sound recordings, considering factors such as environmental or social contexts, which are not treated by current systems. This work exploits retrieval based on context analysis. Query parameters include context variables that are dynamically derived using public services and ontologies associated with sound recording metadata. Part of the results have been validated through a web prototype, discussed in the text.},
author = {Daniel Cintra Cugler and Claudia Bauzer Medeiros and Felipe Toledo},
date = {2012-06-01},
journal = {Concurrency and Computation - Practice and Experience},
keyword = {Article},
title = {An architecture for retrieval of animal sound recordings based on context variables},
year = {2012}
}
|
Fedel, Gabriel de S.;
Medeiros, Claudia Bauzer;
Santos, Jefersson Alex dos
Sinimbu - Multimodal queries to support biodiversity studies (conference)
COMPUTATIONAL SCIENCE AND ITS APPLICATIONS – ICCSA 2012,
LNCS,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{deFedel2012,
abstract = {Typical biodiversity information systems can only solve a small part of user concerns. Available query mechanisms are based on traditional textual database manipulations, combining them with spatial correlations. However, experts need more complex computations – e.g., using non-textual data sources. This involves a considerable amount of manual tasks, to obtain the needed information. This paper presents the specification and implementation of Sinimbu – a framework to process multimodal queries that support both text and images as search parameters, for biodiversity studies, thus providing support for subsequent complex simulations. Sinimbu was validated with real data from our university’s Zoology Museum, which houses one of the largest zoological museum collections in Brazil. Not only can users interact with the system in several modes, but query possibilities (and answers) vary according to the user’s profile. Query processing in Sinimbu combines work in database management, image processing and ontology construction and management.},
author = {Gabriel de S. Fedel and Claudia Bauzer Medeiros and Jefersson Alex dos Santos},
booktitle = {COMPUTATIONAL SCIENCE AND ITS APPLICATIONS – ICCSA 2012},
date = {2012-05-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/fedel_ICCSA2012.pdf},
pages = {620-634},
publisher = {LNCS},
title = {Sinimbu - Multimodal queries to support biodiversity studies},
volume = {7333/2012},
year = {2012}
}
|
Mota, Matheus Silva
Shadows: a new means of representing documents (mastersthesis)
Instituto de Computação - Unicamp,
mastersthesis,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Mastersthesis
)
@mastersthesis{Mota2012,
abstract = {Document production tools are present everywhere, resulting in an exponential growth of increasingly complex, distributed and heterogeneous documents. This hampers document exchange, as well as their annotation and retrieval. While information retrieval mechanisms concentrate on textual features (corpus analysis), annotation approaches either target specific formats or require that a document follows interoperable standards -- defined via schemas. This work presents our effort to handle these problems, providing a more flexible solution. Rather than trying to modify or convert the document itself, or to target only textual characteristics, the strategy described in this work is based on an intermediate descriptor -- the document shadow. A shadow represents domain-relevant aspects and elements of both the structure and content of a given document. Shadows are not restricted to the description of textual features, but also cover other elements, such as multimedia artifacts. Furthermore, shadows can be stored in a database, thereby supporting queries on document structure and content, regardless of document format.},
author = {Matheus Silva Mota},
date = {2012-05-01},
keyword = {Mastersthesis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/DissertationMatheus.pdf},
school = {Instituto de Computação - Unicamp},
title = {Shadows: a new means of representing documents},
year = {2012}
}
|
Alves, Hugo Augusto
Ontologias Folksonomizadas - Uma Abordagem para Fusão de Ontologias e Folksonomias (mastersthesis)
Instituto de Computação - Unicamp,
mastersthesis,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Mastersthesis
)
@mastersthesis{Alves2012,
abstract = {A growing number of web repositories rely on metadata in the form of tags to organize and classify their content. Users of these systems freely associate tags with system resources – e.g., URLs, images, bookmarks. The term folksonomy refers to this collective classification, which emerges from the tagging process carried out by users interacting in social environments on the web. One of the greatest strengths of folksonomies is their ease of use, owing to the absence of a controlled vocabulary. Folksonomies grow organically, reflecting the knowledge of the user community. On the other hand, this lack of structure leads to difficulties in content organization and discovery operations. Better results can be obtained if the semantic relations between tags are taken into account. For this reason, several works have been proposed with the goal of relating ontologies and folksonomies, combining the systematized structure of ontologies with the latent semantics of folksonomies. While in one direction some approaches create “social ontologies” from folksonomy data, in the other direction some approaches link tags to preexisting ontologies. In both cases there is a unidirectionality, i.e., one model only supports the enrichment of the other. Our proposal, in contrast, is bidirectional. Ontologies and folksonomies are merged into a new entity, which we call a “folksonomized ontology”, combining complementary aspects of both. The formal, designed knowledge of ontologies is fused with the latent semantics of social data. In this dissertation we present our folksonomized ontology and its ramifications. We introduce a formal framework for the analysis of related work, in order to contrast it with our approach.
Besides the improvements in indexing and discovery operations, which were validated in practical experiments, we propose a technique called 3E Steps to support ontology evolution using folksonomy data. We also implemented a prototype tool for building folksonomized ontologies and supporting ontology revision.},
author = {Hugo Augusto Alves},
date = {2012-04-01},
keyword = {Mastersthesis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/Ontologias-Folksonomizadas.pdf},
school = {Instituto de Computação - Unicamp},
title = {Ontologias Folksonomizadas - Uma Abordagem para Fusão de Ontologias e Folksonomias},
year = {2012}
}
|
Gatto, Sandro Danilo;
Santanchè, André
Multi-representation Lens for Visual Analytics (conference)
Proceedings of ICDE,
IEEE,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Gatto2012,
abstract = {Modern data analysis deeply relies on computational visualization tools, especially when spatial data is involved. Important efforts in governmental and private agencies are looking for patterns and insights buried in dispersive, massive amounts of data (conventional, spatiotemporal, etc.). In Visual Analytics users must be empowered to analyze data from different perspectives, integrating, transforming, aggregating and deriving new representations of conventional as well as spatial data. However, a challenge for visual analysis tools is how to articulate such a wide variety of data models and formats, especially when multiple representations of geographic elements are involved. A usual approach is to convert data to a database - e.g., a multi-representation database - which centralizes and homogenizes them. This approach has restrictions when facing the dynamic and distributed model of the Web. In this paper we propose an on-the-fly and on-demand multi-representation data integration and homogenization approach, named Lens, as an alternative that fits better with the Web. It combines a metamodel-driven approach to transform data to a unifying multidimensional and multi-representation model, with a middleware-based architecture for seamless and on-the-fly data access, tailored to Visual Analytics.},
author = {Sandro Danilo Gatto and André Santanchè},
booktitle = {Proceedings of ICDE},
date = {2012-03-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/PID2162539.pdf},
publisher = {IEEE},
title = {Multi-representation Lens for Visual Analytics},
year = {2012}
}
|
Nakai, Alan Massaru
Novas Técnicas de Distribuição de Carga para Servidores Web Geograficamente Distribuídos (phdthesis)
Instituto de Computação - Unicamp,
phdthesis,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
PhDThesis
)
@phdthesis{Nakai2012,
abstract = {Load balancing is a problem intrinsic to distributed systems. This thesis addresses the problem in the context of geographically distributed web servers. Replicating web servers across geographically distributed \emph{datacenters} provides fault tolerance and the possibility of offering better response times to clients. A key issue in such scenarios is the efficiency of the load-balancing solution employed to divide the system load among the server replicas. Load balancing allows providers to make better use of their resources, mitigating the need for extra provisioning and helping to tolerate load peaks until the system is adjusted. The goal of this work was to study and propose new load-balancing solutions for geographically distributed web servers. To this end, two tools were implemented to support the analysis and development of new solutions: a test platform built on top of a real web service implementation, and simulation software based on a realistic model of web workload generation. The main contributions of this thesis are four new load-balancing solutions covering three different types: DNS-based, client-based, and server-based solutions.},
author = {Alan Massaru Nakai},
date = {2012-01-01},
keyword = {PhDThesis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/tese_nakai2012_final.pdf},
school = {Instituto de Computação - Unicamp},
title = {Novas Técnicas de Distribuição de Carga para Servidores Web Geograficamente Distribuídos},
year = {2012}
}
|
Mota, Matheus Silva;
Medeiros, Claudia Bauzer
Introducing Shadows: Flexible Document Representation and Annotation on the Web (conference)
4th International Workshop on Data Engineering Meets the Semantic Web (DESWEB) -- co-located with 29th IEEE International Conference on Data Engineering (ICDE2013),
IEEE,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Mota2012b,
abstract = {The Web is witnessing an exponential growth of increasingly complex, distributed and heterogeneous documents. This hampers document exchange, as well as their annotation and retrieval. While information retrieval mechanisms concentrate on textual features (corpus analysis), annotation approaches either target specific formats or require that a document follows interoperable standards. This work presents our effort to handle these problems, providing a more flexible solution. Rather than trying to modify or convert the document itself, or to target only textual characteristics, the strategy described in this work is based on an intermediate descriptor -- the document shadow. A shadow represents domain-relevant aspects and elements of both structure and content of a given document, as defined by a user group. Rather than annotating documents themselves, it is the shadows that are annotated, thereby providing independence between annotations and document formats. Our annotations take advantage of the LOD initiative. Via annotations users can derive correlations across shadows, in a flexible way. Moreover, shadows and annotations are stored in databases, therefore allowing uniform database treatments of heterogeneous documents.},
author = {Matheus Silva Mota and Claudia Bauzer Medeiros},
booktitle = {4th International Workshop on Data Engineering Meets the Semantic Web (DESWEB) -- co-located with 29th IEEE International Conference on Data Engineering (ICDE2013)},
date = {2012-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/ICDEW13wkx_DESWEB_04.pdf},
publisher = {IEEE},
title = {Introducing Shadows: Flexible Document Representation and Annotation on the Web},
year = {2012}
}
|
Malaverri, Joana E. G.;
Medeiros, Claudia Bauzer;
Lamparelli, Rubens Camargo
A Provenance Approach to Assess the Quality of Geospatial Data (conference)
27th Symposium On Applied Computing (SAC),
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Malaverri2012b,
abstract = {Geographic information is present in our daily lives. This pervasiveness is also at the origin of several problems, including heterogeneity and trustworthiness -- of the data sources, of the data providers, and of the data products derived from the original sources. Most efforts to improve this situation concentrate on establishing data collection and curation standards, and quality metadata. This paper extends these efforts by presenting an approach to assess quality of geospatial data based on provenance.},
author = {Joana E. G. Malaverri and Claudia Bauzer Medeiros and Rubens Camargo Lamparelli},
booktitle = {27th Symposium On Applied Computing (SAC)},
date = {2012-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/artigo.pdf},
title = {A Provenance Approach to Assess the Quality of Geospatial Data},
year = {2012}
}
|
Longo, João Sávio C.;
Camargo, Luís Theodoro O.;
Medeiros, Claudia Bauzer;
Santanchè, André
Using the DBV model to maintain versions of multi-scale geospatial data (conference)
Advances in Conceptual Modeling,
Springer-Verlag,
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Longo2012,
abstract = {Work on multi-scale issues concerning geospatial data presents countless challenges that have long been attacked by GIScience researchers. Indeed, a given real world problem must often be studied at distinct scales in order to be solved. Most implementation solutions go either towards generalization (and/or virtualization of distinct scales) or towards linking entities of interest across scales. In this context, the possibility of maintaining the history of changes at each scale is another factor to be considered. This paper presents our solution to these issues, which accommodates all previous research on handling multiple scales into a unifying framework. Our solution builds upon a specific database version model -- the multiversion MVDB -- which has already been successfully implemented in several geospatial scenarios, being extended here to support multi-scale research. The paper also presents our implementation of a framework based on the model to handle and keep track of multi-scale data evolution.},
author = {João Sávio C. Longo and Luís Theodoro O. Camargo and Claudia Bauzer Medeiros and André Santanchè},
booktitle = {Advances in Conceptual Modeling},
date = {2012-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/dbv_multi_scale_api_lis.pdf},
pages = {284-293},
publisher = {Springer-Verlag},
title = {Using the DBV model to maintain versions of multi-scale geospatial data},
volume = {7518},
year = {2012}
}
|
Gomes, Alessandra;
Santanchè, André
Web-based Lab For Taxonomic Description (conference)
Anais do XI Workshop de Ferramentas e Aplicações - WebMedia,
2012.
(
Links |
BibTeX |
Tags:
Conference
)
@conference{Gomes2012,
author = {Alessandra Gomes and André Santanchè},
booktitle = {Anais do XI Workshop de Ferramentas e Aplicações - WebMedia},
date = {2012-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/Paper-TaxonomicLab-Alessandra-Andre-WFA2012.pdf},
title = {Web-based Lab For Taxonomic Description},
year = {2012}
}
|
Bernardo, Ivelize Rocha;
Santanchè, André;
Baranauskas, Maria Cecília Calani
Reconhecendo Padrões em Planilhas no domínio de uso da Biologia (conference)
Simpósio Brasileiro de Sistemas de Informação (SBSI),
2012.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Bernardo2012,
abstract = {Most of the research data handled by biologists are in electronic spreadsheets. Spreadsheets became a popular technique to create data tables, which are easy to implement as isolated entities, but are inappropriate for integration with other spreadsheets or data sources and for enhanced queries, due to the informality of their implicit schemas. Several initiatives aim to interpret these implicit spreadsheet schemas, making them explicit in order to drive the extraction and mapping of native data to open interoperability standards. However, we observed limitations in this interpretation process, which is detached from the spreadsheet creation context. In this paper we present a strategy for characterizing spreadsheets, centered on their creation context, and we investigate how this characterization can be used to improve the automated interpretation and mapping of their respective schemas in the Biology usage domain. The strategy presented here supports work in progress on a tool to automatically recognize spreadsheet schemas.},
author = {Ivelize Rocha Bernardo and André Santanchè and Maria Cecília Calani Baranauskas},
booktitle = {Simpósio Brasileiro de Sistemas de Informação (SBSI)},
date = {2012-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/sbbd_shp_33.pdf},
pages = {360-371},
title = {Reconhecendo Padrões em Planilhas no domínio de uso da Biologia},
year = {2012}
}
|
Alves, Hugo;
Santanchè, André
Abstract Framework for Social Ontologies and Folksonomized Ontologies (conference)
4th International Workshop on Semantic Web Information Management,
SWIM,
2012.
(
Links |
BibTeX |
Tags:
Conference
)
@conference{Alves2012b,
address = {SWIM},
author = {Hugo Alves and André Santanchè},
booktitle = {4th International Workshop on Semantic Web Information Management},
date = {2012-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/swim2012.pdf},
title = {Abstract Framework for Social Ontologies and Folksonomized Ontologies},
year = {2012}
}
|
2011 |
Mota, Matheus Silva;
Longo, João Sávio Ceregatti;
Cugler, Daniel Cintra;
Medeiros, Claudia Bauzer
Using linked data to extract geo-knowledge (conference)
XII Brazilian Symposium on GeoInformatics - GeoInfo,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Mota2011,
abstract = {There are several approaches to extract geo-knowledge from documents and textual fields in databases. Most of them focus on detecting geographic evidence, from which the associated geographic location can be determined. This paper is based on a different premise -- geo-knowledge can be extracted even from non-geographic evidence, taking advantage of the linked data paradigm. The paper gives an overview of our approach and presents two case studies to extract geo-knowledge from documents and databases in the biodiversity domain.},
author = {Matheus Silva Mota and João Sávio Ceregatti Longo and Daniel Cintra Cugler and Claudia Bauzer Medeiros},
booktitle = {XII Brazilian Symposium on GeoInformatics - GeoInfo},
date = {2011-11-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/paper.pdf},
title = {Using linked data to extract geo-knowledge},
year = {2011}
}
|
Mota, Matheus;
Medeiros, Claudia Bauzer
Shadow-driven Document Representation: A summarization-based strategy to represent non-interoperable documents (conference)
XI Workshop on Ongoing Thesis and Dissertations - WebMedia,
SBC,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Mota2011b,
abstract = {Document production tools are present everywhere, resulting in an exponential growth of increasingly complex, distributed and heterogeneous documents. This hampers document exchange, as well as their annotation, indexing and retrieval. Existing approaches to these tasks either concentrate on specific formats or require representing a document’s content using interoperable standards or schemas. This work presents our effort to handle this problem. Rather than trying to modify or convert the document itself, our strategy defines an intermediate and interoperable descriptor – shadow – that summarizes key aspects and elements of a given document, improving its annotation, indexing and retrieval process regardless of its format. Shadows can be used for different purposes, from semantic annotations and context-sensitive annotations, to content indexing and clustering.},
author = {Matheus Mota and Claudia Bauzer Medeiros},
booktitle = {XI Workshop on Ongoing Thesis and Dissertations - WebMedia},
date = {2011-10-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/paper-5.pdf},
pages = {4},
publisher = {SBC},
title = {Shadow-driven Document Representation: A summarization-based strategy to represent non-interoperable documents},
year = {2011}
}
|
Alves, Hugo;
Santanchè, André
Folksonomized Ontologies - from social to formal (conference)
Proceedings of XVII Brazilian Symposium on Multimedia and the Web,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Alves2011,
abstract = {An ever-increasing number of web-based repositories aimed at sharing content, links or metadata rely on tags informed by users to describe, classify and organize their data. The term folksonomy has been used to define this "social taxonomy", which emerges from tagging carried out by users interacting in social environments. It contrasts with the formalism and systematic creation process applied to ontologies. In our research we propose that ontologies and folksonomies have complementary roles. The knowledge systematically organized and formalized in ontologies can be enriched and contextualized by the implicit knowledge which emerges from folksonomies. This paper presents our approach to build a "folksonomized" ontology as a confluence of a formal ontology enriched with social knowledge extracted from folksonomies. The formal embodiment of folksonomies has been explored to empower content search and classification. On the other hand, ontologies are supplied with contextual data, which can improve relationship weighting and inference operations. The paper shows a tool we have implemented to produce and use folksonomized ontologies. It was used to attest that search operations can be improved by this combination of ontologies with folksonomies.},
author = {Hugo Alves and André Santanchè},
booktitle = {Proceedings of XVII Brazilian Symposium on Multimedia and the Web},
date = {2011-10-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/folksonomized-ontologies.pdf},
title = {Folksonomized Ontologies - from social to formal},
year = {2011}
}
|
Koga, Ivo;
Medeiros, Claudia Bauzer;
Branquinho, Omar
Handling and Publishing Wireless Sensor Network Data: a hands-on experiment (article)
Journal of Computational Interdisciplinary Sciences (JCIS),
1,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Article
)
@article{Koga2011,
abstract = {eScience research, in computer science, concerns the development of tools, models and techniques to help scientists from other domains to develop their own research. One problem which is common to all fields is concerned with the management of heterogeneous data, offering multiple interaction possibilities. This paper presents a proposal to help solve this problem, tailored to wireless sensor data – an important data source in eScience. This proposal is illustrated with a case study.},
author = {Ivo Koga and Claudia Bauzer Medeiros and Omar Branquinho},
date = {2011-09-01},
journal = {Journal of Computational Interdisciplinary Sciences (JCIS)},
keyword = {Article},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/2011-09-20-publicado-JCIS-v2n1a02.pdf},
note = {J. Comp. Int. Sci., Volume 2, Issue 1, 2011, 13-22, pdn: jcis.2011.02.01.0028 © copyright 2011 PACIS [http://epacis.net/jcis.php]},
number = {1},
pages = {13-22},
title = {Handling and Publishing Wireless Sensor Network Data: a hands-on experiment},
volume = {2},
year = {2011}
}
|
Jomier, Genevieve;
Medeiros, Claudia Bauzer;
Santanche, Andre
The Multi-focus approach: multidisciplinary cooperations on the Web (Position paper) (article)
Proc. II Workshop of the INCT on Web Science,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Article
)
@article{Jomier2011,
abstract = {This paper is concerned with discussing issues associated with the emerging paradigm of collaborative scientific environments on the Web, and on challenges facing teams with complementary expertise, who work across the Web. The emphasis is on the multiple focuses in which these groups attack a problem, and how this can be approached from a spatio-temporal database perspective.},
author = {Genevieve Jomier and Claudia Bauzer Medeiros and Andre Santanche},
date = {2011-07-01},
journal = {Proc. II Workshop of the INCT on Web Science},
keyword = {Article},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/INCT-WEB2011.pdf},
note = {Paper presented at the II Workshop at the Brazilian Institute of Web Science},
title = {The Multi-focus approach: multidisciplinary cooperations on the Web (Position paper)},
year = {2011}
}
|
Cugler, Daniel Cintra;
Medeiros, Claudia Bauzer;
Toledo, Luís Felipe
Managing Animal Sounds - Some Challenges and Research Directions (conference)
Proceedings V eScience Workshop - XXXI Brazilian Computer Society Conference,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Cugler2011,
abstract = {For decades, biologists around the world have recorded animal sounds. As the number of records grows, so does the difficulty to manage them, presenting challenges to save, retrieve, share and manage the sounds. This paper presents our preliminary results concerning management of large volumes of animal sound data. The paper also provides an overview from our prototype, an online environment focused on management of this data. This paper also discusses our case study, concerning more than 1 terabyte of animal recordings from Fonoteca Neotropical Jacques Vielliard, at UNICAMP, Brazil.},
author = {Daniel Cintra Cugler and Claudia Bauzer Medeiros and Luís Felipe Toledo},
booktitle = {Proceedings V eScience Workshop - XXXI Brazilian Computer Society Conference},
date = {2011-07-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/CSBC.pdf},
title = {Managing Animal Sounds - Some Challenges and Research Directions},
year = {2011}
}
|
Senra, Rodrigo Dias Arruda;
Medeiros, Claudia Bauzer
ORGANOGRAPHS Multi-faceted Hierarchical Categorization of Web Documents (conference)
Proceedings WEBIST - 7th International Conference on Web Information Systems,
INSTICC,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Senra2011,
abstract = {The data deluge of information in the Web challenges internauts to organize their references to interesting content in the Web as well as in their private storage space off-line. Having an automatically managed personal index to content acquired from the Web is useful for everybody, but critical to researchers and scholars. In this paper, we discuss concepts and problems related to organizing information through multi-faceted hierarchical categorization. We introduce the organograph as a mechanism to specify multiple views of how content is organized. Organographs can help scientists to automatically organize their documents along multiple axes, improving sharing and navigation through themes and concepts according to a particular research objective.},
author = {Rodrigo Dias Arruda Senra and Claudia Bauzer Medeiros},
booktitle = {Proceedings WEBIST - 7th International Conference on Web Information Systems},
date = {2011-05-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/WebOrganization.pdf},
publisher = {INSTICC},
title = {ORGANOGRAPHS Multi-faceted Hierarchical Categorization of Web Documents},
year = {2011}
}
|
Nakai, Alan Massaru;
Madeira, Edmundo;
Buzato, Luiz E.
Improving the QoS of Web Services via Client-Based Load Distribution (conference)
XXIX Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos (Aceito para apresentação),
2011.
(
Abstract |
BibTeX |
Tags:
Conference
)
@conference{Nakai2011,
abstract = {The replication of a web service over geographically distributed locations can improve the QoS perceived by its clients. An important issue in such a deployment is the efficiency of the policy applied to distribute client requests among the replicas. In this paper, we propose a new approach for client-based load distribution that adaptively changes the fraction of load each client submits to each service replica to try to minimize overall response times. Our results show that the proposed strategy can achieve better response times than algorithms that eagerly try to choose the best replica for each client.},
author = {Alan Massaru Nakai and Edmundo Madeira and Luiz E. Buzato},
booktitle = {XXIX Simpósio Brasileiro de Redes de Computadores e Sistemas Distribuídos (Aceito para apresentação)},
date = {2011-05-01},
keyword = {Conference},
note = {Accepted for presentation},
title = {Improving the QoS of Web Services via Client-Based Load Distribution},
year = {2011}
}
|
Fedel, Gabriel de Souza
Busca multimodal para apoio à pesquisa em biodiversidade (mastersthesis)
Instituto de Computação - Unicamp,
mastersthesis,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Mastersthesis
)
@mastersthesis{deFedel2011,
abstract = {Research in computing applied to biodiversity presents many challenges, such as the existence of large amounts of data and their heterogeneity and variety. The search tools available for such data are still limited and usually only consider textual data, failing to explore the potential of searching over data of other kinds, such as images or sounds. The goal of this project is to analyze the problems of performing multimodal queries with text and image for the biodiversity domain, proposing a set of tools to process such queries. We expect that, with this integrated search, the retrieval of biodiversity data will become more comprehensive, assisting biodiversity researchers in their tasks and encouraging lay users to access these data. This work is part of the BioCORE project, a partnership between computer science and biology researchers to improve biodiversity research.},
author = {Gabriel de Souza Fedel},
date = {2011-04-01},
keyword = {Mastersthesis},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/dissertacao2.pdf},
school = {Instituto de Computação - Unicamp},
title = {Busca multimodal para apoio à pesquisa em biodiversidade},
year = {2011}
}
|
Nakai, Alan Massaru;
Madeira, Edmundo;
Buzato, Luiz E.
Load Balancing for Internet Distributed Services using Limited Redirection Rates (conference)
Proceedings of the 5th Latin-American Symposium on Dependable Computing,
2011.
(
Abstract |
BibTeX |
Tags:
Conference
)
@conference{Nakai2011b,
abstract = {The Internet has become the universal support for computer applications. This increases the need for solutions that provide dependability and QoS for web applications. The replication of web servers on geographically distributed datacenters allows the service provider to tolerate disastrous failures and to improve the response times perceived by clients. A key issue for good performance of worldwide distributed web services is the efficiency of the load balancing mechanism used to distribute client requests among the replicated servers. Load balancing can reduce the need for over-provision of resources, and help tolerate abrupt load peaks and/or partial failures through load conditioning. In this paper, we propose a new load balancing solution that reduces service response times by redirecting requests to the closest remote servers without overloading them. We also describe a middleware that implements this protocol and present the results of a set of simulations that show its usefulness.},
author = {Alan Massaru Nakai and Edmundo Madeira and Luiz E. Buzato},
booktitle = {Proceedings of the 5th Latin-American Symposium on Dependable Computing},
date = {2011-04-01},
keyword = {Conference},
title = {Load Balancing for Internet Distributed Services using Limited Redirection Rates},
year = {2011}
}
|
Gomes, Alessandra;
Santanchè, André
Autoria Virtual Baseada no Mundo Real (conference)
Anais do X Workshop de Ferramentas e Aplicações - WebMedia,
2011.
(
Links |
BibTeX |
Tags:
Conference
)
@conference{eSantanche2011,
author = {Alessandra Gomes and André Santanchè},
booktitle = {Anais do X Workshop de Ferramentas e Aplicações - WebMedia},
date = {2011-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/autoria-virtual-baseada-em-dados-do-mundo-real.pdf},
title = {Autoria Virtual Baseada no Mundo Real},
year = {2011}
}
|
Costa, Taluna Mendes d'Araújo;
Santanchè, André
Padrão de Anotação Semântica de Código e sua Aplicação no Desenvolvimento de Componentes (conference)
Anais do V Simpósio Brasileiro de Componentes, Arquiteturas e Reutilização de Software,
2011.
(
Links |
BibTeX |
Tags:
Conference
)
@conference{dAraujoeSantanche2011,
author = {Taluna Mendes d'Araújo Costa and André Santanchè},
booktitle = {Anais do V Simpósio Brasileiro de Componentes, Arquiteturas e Reutilização de Software},
date = {2011-01-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/anotacao-semantica-codigo1.pdf},
title = {Padrão de Anotação Semântica de Código e sua Aplicação no Desenvolvimento de Componentes},
year = {2011}
}
|
Medeiros, Claudia Bauzer;
Santanche, Andre;
Madeira, Edmundo;
Martins, Eliane;
Magalhaes, Geovane;
Baranauskas, Maria Cecilia;
Leite, Neucimar;
Torres, Ricardo da Silva
Data Driven Research at LIS: the Laboratory of Information Systems at UNICAMP (article)
JIDM,
2,
2011.
(
Abstract |
Links |
BibTeX |
Tags:
Article
)
@article{Medeiros2011,
abstract = {This article presents an overview of the research conducted at the Laboratory of Information Systems (LIS) at the Institute of Computing, UNICAMP. Its creation, in 1994, was motivated by the need to support data-driven research within multidisciplinary projects involving computer scientists and scientists from other fields. Throughout the years, it has housed projects in many domains - in agriculture, biodiversity, medicine, health, bioinformatics, urban planning, telecommunications, and sports - with scientific results in these fields and in Computer Science, with emphasis in data management, integrating research on databases, image processing, human-computer interfaces, software engineering and computer networks. The research produced 14 PhD theses, 70 MSc dissertations, 40$+$ journal papers and 200$+$ conference papers, having been assisted by over 80 undergraduate student scholarships. Several of these results were obtained through cooperation with many Brazilian universities and research centers, as well as groups in Canada, USA, France, Germany, the Netherlands and Portugal. The authors of this article are faculty at the Institute whose students developed their MSc or PhD research in the lab. For additional details, online systems, papers and reports, see http://www.lis.ic.unicamp.br and http://www.lis.ic.unicamp.br/publications},
author = {Claudia Bauzer Medeiros and Andre Santanche and Edmundo Madeira and Eliane Martins and Geovane Magalhaes and Maria Cecilia Baranauskas and Neucimar Leite and Ricardo da Silva Torres},
date = {2011-01-01},
journal = {JIDM},
keyword = {Article},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/Versao-FINAL.pdf},
number = {2},
pages = {93-108},
title = {Data Driven Research at LIS: the Laboratory of Information Systems at UNICAMP},
volume = {2},
year = {2011}
}
|
Mariote, Leonardo;
Medeiros, Claudia Bauzer;
Torres, Ricardo da Silva;
Bueno, Lucas M.
TIDES—a new descriptor for time series oscillation behavior (article)
Geoinformatica,
2011.
(
Abstract |
BibTeX |
Tags:
Article
)
@article{Mariote2011,
abstract = {Sensor networks have increased the amount and variety of temporal data available, requiring the definition of new techniques for data mining. Related research typically addresses the problems of indexing, clustering, classification, summarization, and anomaly detection. There is a wide range of techniques to describe and compare time series, but they focus on series’ values. This paper concentrates on a new aspect—that of describing oscillation patterns. It presents a technique for time series similarity search, and multiple temporal scales, defining a descriptor that uses the angular coefficients from a linear segmentation of the curve that represents the evolution of the analyzed series. This technique is generalized to handle co-evolution, in which several phenomena vary at the same time. Preliminary experiments with real datasets showed that our approach correctly characterizes the oscillation of single time series, for multiple time scales, and is able to compute the similarity among sets of co-evolving series.},
author = {Leonardo Mariote and Claudia Bauzer Medeiros and Ricardo da Silva Torres and Lucas M. Bueno},
date = {2011-01-01},
journal = {Geoinformatica},
keyword = {Article},
pages = {75-109},
title = {TIDES—a new descriptor for time series oscillation behavior},
volume = {15},
year = {2011}
}
|
2010 |
Carromeu, Camilo;
Medeiros, Claudia Bauzer
Spatial Monitoring of Cattle – Impact on the Carbon Cycle (conference)
Proc. GeoChange 2010 - Research Symposium GIScience for Environmental Change,
2010.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Carromeu2010,
abstract = {There is a growing demand for accurate information about the real environmental impact caused by cattle, accompanied by a concern for increased production of cattle related products in a sustainable manner. With the widespread adoption of RFID chips for bovine traceability and new technologies for measuring carbon dioxide in the atmosphere, it is now feasible to develop carbon cycle models that combine such factors. This presents challenges that range from data management to model specification and validation, to correlate animal movements and their impact on different biomes. This paper presents a proposal towards this goal, concerned with the creation of a framework to store and index semantic space trajectories of livestock to enable monitoring of the production of CO2.},
author = {Camilo Carromeu and Claudia Bauzer Medeiros},
booktitle = {Proc. GeoChange 2010 - Research Symposium GIScience for Environmental Change},
date = {2010-11-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/camilo-geochange_2010.pdf},
title = {Spatial Monitoring of Cattle – Impact on the Carbon Cycle},
year = {2010}
}
|
Fedel, Gabriel de Souza;
Medeiros, Claudia Bauzer
Busca multimodal para apoio à pesquisa em biodiversidade (conference)
WTDBD - Workshop de Teses e Dissertações em Bancos de Dados,
2010.
(
Links |
BibTeX |
Tags:
Conference
)
@conference{deFedel2010,
author = {Gabriel de Souza Fedel and Claudia Bauzer Medeiros},
booktitle = {WTDBD - Workshop de Teses e Dissertações em Bancos de Dados},
date = {2010-10-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/ArtigoWTDBD2010.pdf},
title = {Busca multimodal para apoio à pesquisa em biodiversidade},
year = {2010}
}
|
Malaverri, Joana E. Gonzales;
Medeiros, Claudia Bauzer
Handling Provenance in Biodiversity (conference)
Workshop on Challenges in eScience (CIS),
2010.
(
Abstract |
Links |
BibTeX |
Tags:
Conference
)
@conference{Malaverri2010,
abstract = {One of the concerns in eScience research is the design and development of novel solutions to support distributed collaboration. In this context, regardless of the scientific domain, an important problem is the reproducibility of the results from scientific activities, considering the heterogeneous data involved and the specific research context. This paper presents a proposal to help solve this problem, proposing a software architecture to handle provenance issues.},
author = {Joana E. Gonzales Malaverri and Claudia Bauzer Medeiros},
booktitle = {Workshop on Challenges in eScience (CIS)},
date = {2010-10-01},
keyword = {Conference},
link = {http://www.lis.ic.unicamp.br/wp-content/uploads/2014/09/joanaCIS_CamRea.pdf},
title = {Handling Provenance in Biodiversity},
year = {2010}
}
|
Santanchè, André;
Baumann, Peter
Component-based Web Clients For Scientific Data Exploration Using The DCC Framework (conference)
GIScience 2010,
Zurich, Switzerland,
2010.
(
|