12:30-13:30 Countering Web Spam with Credibility-Based Link Analysis
Prof. Ling Liu (Georgia Institute of Technology)
13:30-13:35 Break
13:35-14:35 Automated Application Deployment and Staging using Code
Generation in Elba Project
Prof. Calton Pu (Georgia Institute of Technology)
14:35-14:40 Break
14:40-15:40 Spatial Alarms: Scalable Architecture and Energy-efficient
Processing Techniques
Prof. Ling Liu (Georgia Institute of Technology)
1. Countering Web Spam with Credibility-Based Link Analysis
Prof. Ling Liu (Georgia Institute of Technology)
As the Web has grown and increasingly become the primary portal
for sharing information and supporting online commerce, there has been
a rise in efforts to manipulate (or spam) how users view and interact
with the Web. A prominent example of this Web spam is the targeted
manipulation of link-based ranking systems to increase the rankings
(and, hence, the amount of user traffic) of particular Web pages,
regardless of the intrinsic merits of those pages. Interestingly,
popular link-based Web ranking algorithms like PageRank, HITS, and
TrustRank rely on a fundamental assumption that the quality of a page
and the quality of a page's links are the same.
In this talk, we analyze link-based Web spam, propose the concept of
link credibility, and argue that the intrinsic quality of a page in terms
of its content should be distinguished from its intrinsic link credibility.
First, we propose several techniques for semi-automatically assessing
link credibility for all Web pages, since manually determining the link
credibility of every page on the Web is infeasible. Second, we propose a
novel credibility-based Web ranking algorithm, CredibleRank. The new
ranking algorithm incorporates link credibility information directly into
the quality assessment of each page on the Web. CredibleRank effectively
counters the influence of link hijacking, honeypots, and other spammer
attacks that seek to corrupt link-based algorithms. Finally, we introduce
concrete metrics for measuring the spam-resilience properties of ranking
algorithms and show how the proposed credibility-based ranking algorithm
outperforms both PageRank and TrustRank over real-world Web data of over
100 million pages.
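The abstract does not give the CredibleRank formulation, so the following is only a minimal sketch of the general idea of folding link credibility into a PageRank-style iteration. It assumes each page has a credibility score in [0, 1] that scales how much rank the page passes along its outgoing links. The function name, the scoring scheme, and the toy graph in the usage note are illustrative assumptions, not the method presented in the talk.

```python
def credibility_weighted_rank(links, credibility, damping=0.85, iterations=50):
    """PageRank-style power iteration with credibility-scaled out-links.

    links:       dict mapping each page to the list of pages it links to
                 (every link target must also appear as a key).
    credibility: dict mapping each page to a score in [0, 1]; pages
                 missing from the dict are treated as fully credible.
    This is an illustrative weighting, not the published CredibleRank.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the uniform teleport share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = links[src]
            if not targets:
                continue  # dangling page: passes nothing along
            # A page with low link credibility passes on less of its rank.
            share = damping * credibility.get(src, 1.0) * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        # Renormalize so the scores remain a probability distribution.
        total = sum(new_rank.values())
        rank = {p: v / total for p, v in new_rank.items()}
    return rank
```

For example, in a toy graph where two honest pages link to `good` and two low-credibility pages link to `target`, lowering the credibility of the pages that point at `target` lowers its score relative to `good`, even though the two pages have identical link structure.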
Biography:
Dr. Ling Liu is an Associate Professor in the College of Computing at
Georgia Institute of Technology. There she directs the research programs
in the Distributed Data Intensive Systems Lab (DiSL), examining
performance, security, privacy, and data management issues in building
large-scale network-centric and data-intensive systems and applications.
Dr. Liu and the DiSL research group have been working on various aspects
of distributed data-intensive systems, ranging from decentralized overlay
networks, mobile computing and location-based services, and sensor network
and event stream processing to service-oriented computing and
architectures. She has published over 200 international journal and
conference articles in the areas of Internet computing systems, Internet
data management, distributed systems, and information security. Her
research group has produced a number of open source software systems,
among which the most popular are WebCQ, XWRAPElite, and PeerCrawl. She has
chaired a number of conferences as a PC chair, vice PC chair, or general
chair, including the IEEE International Conference on Data Engineering
(ICDE 2004, ICDE 2006, ICDE 2007), the IEEE International Conference on
Distributed Computing Systems (ICDCS 2006), the IEEE International
Conference on Web Services (ICWS 2004), the CreateNet-ICST Collaborative
Computing Conference (CollaborateCom 2005, 2006), and the ACM
International Conference on Information and Knowledge Management
(CIKM 2000).
Dr. Liu is currently on the editorial board of several international
journals, including IEEE Transactions on Knowledge and Data Engineering,
International Journal of Very Large Database Systems (VLDBJ), International
Journal of Peer-to-Peer Networking and Applications (Springer),
International Journal of Web Services Research, and Wireless Network
Journal (WINET). Dr. Liu is the recipient of the best paper award of
ICDCS 2003, the best paper award of WWW 2004, the 2005 Pat Goldberg
Memorial Best Paper Award, and IBM faculty awards in 2003 and 2006.
Dr. Liu’s research is primarily sponsored by NSF, DARPA, DoE, and IBM.
2. Automated Application Deployment and Staging using Code Generation
in Elba Project
Prof. Calton Pu (Georgia Institute of Technology)
ABSTRACT
Large N-Tier applications running in data centers have complex deployment
requirements and dependencies that change frequently. The increasing
complexity and scalability requirements of such applications demand
automated design, testing, deployment and monitoring of applications.
The goal of the Elba project is to automate the staging and testing of
complex enterprise systems before deployment to production. Automating
the staging process lowers the cost of testing applications and improves
their reliability. Elba software tools extract test parameters from
production specifications, such as SLAs, and deployment specifications,
and, via the Mulini generator, create staging plans for the application.
A benchmark application, TPC-W, is used to validate the generated
configuration. Simple learning tools identify system bottlenecks and
refine application deployments based on performance and cost.
Biography:
Calton Pu was born in Taiwan and grew up in Brazil. He received his PhD
from University of Washington in 1986 and served on the faculty of Columbia
University and Oregon Graduate Institute. He currently holds the
position of Professor and John P. Imlay, Jr. Chair in Software at the
College of Computing, Georgia Institute of Technology. His current work
covers three areas. First, he is using automated code generation
techniques to automate and ensure the correct deployment of large scale
N-tier applications in the Elba project. Second, he is investigating
software and statistical techniques to defend against Denial of Information
attacks in areas such as email and web spam. Third, he is working on the
application of specialization and other techniques to ensure the
reliability, trust, and security of system and application software. He
has been the
principal investigator of the Infosphere, Synthetix, and Immunix projects,
with technical contributions such as Epsilon Serializability, Reflective
Transaction Framework, and Continual Queries over the Internet. His
collaborations include applications of these techniques in scientific
research on macromolecular structure data, weather data, and environmental
data, as well as in industrial settings. He has published more than 50
journal papers and book chapters and 150 conference and refereed workshop
papers, and has served on more than 100 program committees, including as
co-PC chair of SRDS'95, ICDE'99, COOPIS'02, and SRDS'03, and as co-general
chair of ICDE'97, CIKM'01, and ICDE'06.
3. Spatial Alarms: Scalable Architecture and Energy-efficient
Processing Techniques
Prof. Ling Liu (Georgia Institute of Technology)
Time-based alarms are used by many people on a daily basis. Spatial alarms
extend the very same idea to location-based triggers, which are fired
whenever a mobile user enters the spatial region of an alarm. Spatial
alarms provide critical capabilities for many mobile location-based
applications, ranging from personal assistants and inventory tracking to
industrial safety warning systems. In this talk we present a scalable
architecture for energy-efficient processing of spatial alarms that
maintains low computation and storage costs on mobile clients. Two
architectural designs are considered: client-based and server-based
solutions. In the client-based architecture, we focus on efficient
techniques for processing spatial alarms while minimizing energy
consumption on mobile clients. In the server-based architecture, we focus
on systematic methods for indexing spatial alarms, enabling efficient
processing of public alarms, group-shared spatial alarms, and private
spatial alarms. We first introduce the concept of safe distance to reduce
the number of unnecessary mobile client wakeups for spatial alarm
evaluation, enabling mobile clients to sleep for longer intervals in the
presence of active spatial alarms. We show that our safe distance
techniques can significantly reduce the energy consumption on mobile
clients compared to periodic wakeups while preserving the accuracy and
timeliness of spatial alarms. Second, we present a suite of techniques for
minimizing the number of location triggers to be checked for spatial alarm
evaluation upon each wakeup. This further reduces the computation cost and
energy expenditure on mobile clients. We evaluate the scalability and
energy efficiency of our approach using a road network simulator. Our
spatial alarms middleware architecture offers significant improvements in
the battery lifetime of mobile clients while maintaining high quality of
spatial alarm services, especially compared to the conventional approach
of periodic wakeup and checking all alarms upon each wakeup.
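
The abstract does not define how safe distance is computed, so the following is a rough sketch under stated assumptions: alarm regions are axis-aligned rectangles, distance is straight-line (not road-network) distance, and the client has a known maximum speed. Under those assumptions, the distance to the nearest alarm region bounds how long the client can sleep without missing an alarm. The `Alarm` class and all function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    """Hypothetical spatial alarm covering an axis-aligned rectangle."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def distance_to_alarm(x, y, alarm):
    """Straight-line distance from (x, y) to the alarm region (0 if inside)."""
    dx = max(alarm.x_min - x, 0.0, x - alarm.x_max)
    dy = max(alarm.y_min - y, 0.0, y - alarm.y_max)
    return math.hypot(dx, dy)


def safe_distance(x, y, alarms):
    """Distance the client can travel in any direction without hitting an alarm."""
    return min((distance_to_alarm(x, y, a) for a in alarms), default=math.inf)


def sleep_seconds(x, y, alarms, max_speed):
    """Longest sleep that cannot miss an alarm, assuming speed <= max_speed."""
    if max_speed <= 0:
        raise ValueError("max_speed must be positive")
    return safe_distance(x, y, alarms) / max_speed
```

Compared with waking at a fixed period, a client using such a bound wakes only when it could plausibly have reached an alarm region, which is the intuition behind the energy savings claimed in the abstract.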