Belo Horizonte, Minas Gerais, Brazil
1K followers
500+ connections
About
Activity
-
Raja was present at the Google Cloud for Startups event! 🚀 Last week, we had the honor of taking part in an incredible event, organized…
Liked by Bruno Possas
-
Yesterday I had the honor of joining my colleagues in São Paulo at our sixth annual Google for Brazil event to share our latest AI innovations and…
Liked by Bruno Possas
Experience and education
Publications
-
Concept-based interactive query expansion
Proceedings of the 14th ACM international conference on Information and knowledge management
Despite the recent advances in search quality, the fast increase in the size of the
Web collection has introduced new challenges for Web ranking algorithms. In fact, there are
still many situations in which the users are presented with imprecise or very poor results.
One of the key difficulties is the fact that users usually submit very short and ambiguous
queries, and they do not fully specify their information needs. That is, it is necessary to
improve the query formation process if better answers are to be provided. In this work we ...
-
Maximal termsets as a query structuring mechanism
Proceedings of the 14th ACM international conference on Information and knowledge management
Search engines process queries conjunctively to restrict the size of the answer set.
Further, it is not rare to observe a mismatch between the vocabulary used in the text of Web
pages and the terms used to compose the Web queries. The combination of these two
features might lead to irrelevant query results, particularly in the case of more specific
queries composed of three or more terms. To deal with this problem we propose a new
technique for automatically structuring Web queries as a set of smaller subqueries. To ...
-
Set-based vector model: An efficient approach for correlation-based ranking
ACM Transactions on Information Systems (TOIS)
This work presents a new approach for ranking documents in the vector space
model. The novelty lies in two fronts. First, patterns of term co-occurrence are taken into
account and are processed efficiently. Second, term weights are generated using a data
mining technique called association rules. This leads to a new ranking mechanism called
the set-based vector model. The components of our model are no longer index terms but
index termsets, where a termset is a set of index terms. Termsets capture the intuition that ...
-
Processing conjunctive and phrase queries with the set-based model
International Symposium on String Processing and Information Retrieval
The objective of this paper is to present an extension to the set-based model (SBM), which is an effective technique for computing term weights based on co-occurrence patterns, for processing conjunctive and phrase queries. The intuition that semantically related term occurrences often occur closer to each other is taken into consideration. The novelty is that all known approaches that account for co-occurrence patterns were initially designed for processing disjunctive (OR) queries, and our extension provides a simple, effective and efficient way to process conjunctive (AND) and phrase queries. This technique is time efficient and yet yields nice improvements in retrieval effectiveness. Experimental results show that our extension improves the average precision of the answer set for all collections evaluated, keeping computational cost small. For the TREC-8 collection, our extension led to a gain, relative to the standard vector space model, of 23.32% and 18.98% in average precision curves for conjunctive and phrase queries, respectively.
-
Discovering search engine related queries using association rules
Journal of Web Engineering
This work presents a method for online generation of query related suggestions for
a Web search engine. The method uses association rules to extract related queries from the
log of submitted queries to the search engine. Experiments were performed on a
real log containing more than 2.3 million queries submitted to a commercial search engine.
For the top 5 related terms our method presented correct suggestions 90.5% of the time.
Using queries randomly selected from a log we obtained 93.45% of correct suggestions. A ...
-
Enhancing the set-based model using proximity information
SPIRE 2002: Proceedings of the 9th International Symposium on String Processing and Information Retrieval
… set-based model (SBM), which is an effective technique for computing term weights based on co-occurrence patterns, employing the information about the proximity among query terms in documents. The intuition that semantically related term occurrences often occur closer to each other is taken into consideration, leading to a new information retrieval model called proximity set-based model (PSBM). The novelty is that the proximity information is used as a pruning strategy to determine only related co-occurrence term patterns. This technique is time efficient and yet yields nice improvements in retrieval effectiveness. Experimental results show that PSBM improves the average precision of the answer set for all four collections evaluated. For the CFC collection, PSBM leads to a gain, relative to the standard vector space model (VSM), of 23% in average precision values and 55% in average precision for the top 10 documents. PSBM is also competitive in terms of computational performance, reducing the execution time of the SBM by 21% for the CISI collection.
-
Set-based model: A new approach for information retrieval
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
The objective of this paper is to present a new technique for computing term
weights for index terms, which leads to a new ranking mechanism, referred to as set-based
model. The components in our model are no longer terms, but termsets. The novelty is that
we compute term weights using a data mining technique called association rules, which is
time efficient and yet yields nice improvements in retrieval effectiveness. The set-based
model function for computing the similarity between a document and a query considers ...
-
Mining frequent itemsets in evolving databases
Proceedings of the 2002 SIAM International Conference on Data Mining
The field of knowledge discovery and data mining (KDD), spurred by
advances in data collection technology, is concerned with the process of deriving interesting
and useful patterns from large datasets. The KDD process is computational and data-intensive
and is inherently interactive and iterative in nature. In fact, interactivity is often the
key to facilitating effective data understanding and knowledge discovery. In such an
environment, response time is crucial because lengthy time delay between responses of ...
-
Knowledge management in association rule mining
In Integrating Data Mining and Knowledge Management, held in conjunction with the 2001 IEEE International Conference on Data Mining (ICDM)
Most current work on discovery of association rules assumes that the database from
which the rules are determined is static. The mining operation is performed just once and
therefore there is no need of knowledge management integration techniques. However,
there are several domains where the database is updated on a regular basis. In these
dynamic databases, it is hard to maintain the discovered rules since the updates may not
only invalidate some existing rules but also make other rules relevant. We present an ...
-
Mineração Assíncrona de Regras de Associação em Sistemas de Memória Compartilhada-Distribuída [Asynchronous Mining of Association Rules on Distributed Shared-Memory Systems]
Anais do 2º Workshop em Computação de Alto Desempenho
Finding the association rules present in large databases is an
important problem in Data Mining. There is a great need to
develop parallel algorithms for this problem, since it is a
very costly computational process. However, most of the algorithms proposed
for mining such rules follow an iterative search, which imposes the need for
synchronization at the end of each iteration, degrading performance. Another deficiency ...
-
Mineração Incremental de Regras de Associação [Incremental Mining of Association Rules]
Simpósio Brasileiro de Arquitetura de Computadores e Processamento de Alto Desempenho
The effective and continuous use of data mining techniques is hindered by the constant addition of new transactions, which results in enormous databases, and by changes in the criteria used in the mining activity, which in the case of association rules are support and confidence. The problem is that this dynamism can invalidate some existing rules and give rise to new relevant rules. In this paper we present PELICANO, an efficient algorithm for incremental generation of association rules, which relies only on the maximal frequent itemsets and on the occurrence of items in transactions to update the association rule base. The maximal itemsets are used to perform a top-down enumeration of all frequent itemsets, minimizing the number of candidate sets processed to update the maximal frequent itemsets. PELICANO differs from other incremental algorithms mainly in that it allows variations in the minimum support value and accesses the database of new transactions exactly once, minimizing input/output costs. We evaluated our algorithm by performing incremental mining on both synthetic and real databases, which ran up to 15 times faster using PELICANO.
-
Modelagem Vetorial Estendida por Regras de Associação [Vector Modeling Extended by Association Rules]
Simpósio Brasileiro de Banco de Dados
The goal of this work is to present an extension to the vector model that accounts for the
correlation among query terms, by using association rules, a popular data mining technique.
In Information Retrieval, the vector model allows retrieving a set of documents from a
term-based query, where both query terms and documents are vectors in a vector space.
Although the vector model has been used successfully for decades, there are no practical
and efficient mechanisms that account for correlations among query terms in each ...
-
Using quantitative information for efficient association rule generation
Journal of the Brazilian Computer Society
The solution of the mining association rules problem in customer transactions was
introduced by Agrawal, Imielinski and Swami in 1993. Their approach was extended in
several directions such as adding or replacing the confidence and support by other
measures, or how to also account for quantitative attributes. In this paper we present an
algorithm that can be used in the context of several of the extensions provided in the
literature while preserving its performance, as illustrated by a case study. Our approach is ...
-
Paralelização de Geração de Regras de Associação [Parallelization of Association Rule Generation]
Simpósio Brasileiro de Arquitetura de Computadores e Processamento de Alto Desempenho
Data mining is an emerging research area whose main goal is to
extract implicit patterns and rules from databases. Many algorithms for mining
association rules have been proposed. However, research has focused
mainly on sequential algorithms. In this paper we present the parallelization of an
algorithm for determining association rules, using the shared-memory
paradigm. The results indicate that our parallelization is scalable up to eight ...
Honors and awards
-
Best Paper Award
Simpósio Brasileiro de Banco de Dados
Since 1998, the best paper at SBBD has been selected each year and receives the José Mauro Volkmer de Castilho award.
Languages
-
Portuguese
Native or bilingual proficiency
-
English
Advanced proficiency
More activity by Bruno
-
Yesterday was the day to take the startups from Órbi Conecta and San Pedro Valley to Google. With tickets sold out in less than a day, it just proves how much the…
Liked by Bruno Possas
-
Yesterday we had an afternoon entirely dedicated to the Belo Horizonte startup ecosystem, at Google's first office in Brazil! It is always a…
Liked by Bruno Possas
-
It was great to be with the Google team today in BH discussing innovation and technology. The AI tools on offer are incredible and the words…
Liked by Bruno Possas
-
I had the pleasure of attending #GoogleForBrasil in Sao Paulo this week where we shared our latest investments and innovations in the country. I’m…
Liked by Bruno Possas
-
Last week, after more than 6 years, I left my job at Google. I enjoyed my time at Google, learned a lot, met some great mentors and even better…
Liked by Bruno Possas
-
I only recently switched to an iPhone and discovered Apple’s default auto text replacement of ‘Omw’ to “On my way!”. What’s amusing is that this is…
Liked by Bruno Possas