<?xml version="1.0" encoding="iso-8859-1"?>
<!-- generator="FeedCreator 1.7.2" -->
<rss version="2.0">
	<channel>
		<title>Joomla! powered Site</title>
		<description>Joomla! site syndication</description>
		<link>http://tomtuduc.info</link>
		<lastBuildDate>Mon, 01 Dec 2008 12:41:43 +0100</lastBuildDate>
		<generator>FeedCreator 1.7.2</generator>
		<image>
			<url>http://tomtuduc.info/images/M_images/joomla_rss.png</url>
			<title>Powered by Joomla!</title>
			<link>http://tomtuduc.info</link>
			<description>Joomla! site syndication</description>
		</image>
		<item>
			<title>Fraud detection with Complex Event Processing</title>
			<link>http://tomtuduc.info/content/view/68/</link>
			<description>
Opportunities and threats are two sides of the same bottom line. Threats
include fraud of all kinds, both isolated and &quot;wholesale&quot;. Fraud
detection is mostly about finding significant exceptions to normal
patterns. The data in a typical Business Intelligence data warehouse
needs to be transformed before fraud detection algorithms can use it. In
typical BI applications, data are gathered to predict buying patterns,
conversion patterns, or churn patterns, e.g. which campaigns produce
the most conversions for certain market segments. There the data are
grouped into categories, or subgroups, and a model is then used to
predict the behavior of each subgroup.

There are three related areas of fraud metrics: detection, remedies, and
prevention. Fraud detection mainly relies on statistical anomalies. Fraud
prevention employs &quot;what-if&quot; scenarios and anticipates possible breaches.
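
As a minimal sketch of detection by statistical anomaly, the Python below
flags amounts that sit far from the median of an account's history, using
the median absolute deviation as the yardstick. The sample amounts, the
function name flag_outliers, and the cutoff of 3.5 are illustrative
assumptions, not part of any product mentioned in this post.

```python
from operator import gt  # named comparison, so the snippet needs no XML escaping
from statistics import median


def flag_outliers(amounts, cutoff=3.5):
    """Return the amounts whose robust z-score exceeds the cutoff (an assumed value)."""
    mid = median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(a - mid) for a in amounts)
    if mad == 0:
        return []  # no spread at all, so nothing can be called exceptional
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [a for a in amounts if gt(0.6745 * abs(a - mid) / mad, cutoff)]


# Hypothetical card transactions for one account.
history = [42.0, 38.5, 51.0, 47.25, 40.0, 44.9, 39.0, 980.0]
print(flag_outliers(history))
```

A median-based score is used here rather than a mean-based one because a
single large fraud would otherwise inflate the very average it is compared
against.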

Fraud detection and remedies share similarities with information
security breach detection and remedies. For example, &quot;honeypots&quot; are
used to lure potential violators. Fraud reduction metrics can guide the
selection of remedies and fraud detection methods. &quot;Leave no stone
unturned&quot; is a useful notion when forming possible scenarios and
eliminating unhelpful hypotheses. On the Internet, not only is the
originating IP address useful, but redirect patterns and response
times can also indicate suspicious activity.

Some possible technologies:
- Rule-based engines, e.g. Blade, ILOG
- Complex event processing, e.g. SQLstream, Coral8 (see a previous blog post)
- Neural nets
- Tibco's SOA architecture using the Joint Directors of Laboratories (JDL)
data fusion model

The JDL model can use statistical and data mining techniques, including
classification (trees), association, correlation, and clustering, to produce
normalized event streams. From these, scorecards are produced to show
possibly fraudulent activities. A rules-based system can then be used to
classify the kinds of fraud.
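
To make the last step concrete, here is a small Python sketch of a
rules-based scorecard applied to one normalized event. The rule names,
point weights, thresholds, and event fields are all hypothetical; the
engines listed above have their own rule languages, which this does not
attempt to reproduce.

```python
from operator import ge, gt

# Each rule: (fraud kind, points, predicate over one normalized event). All hypothetical.
RULES = [
    ("card testing", 40, lambda e: ge(e["attempts_last_hour"], 10)),
    ("account takeover", 35, lambda e: e["country"] != e["home_country"]),
    ("unusual amount", 25, lambda e: gt(e["amount"], 5 * e["typical_amount"])),
]


def scorecard(event):
    """Return (total score, list of fraud kinds whose rule fired) for one event."""
    fired = [(kind, points) for kind, points, test in RULES if test(event)]
    return sum(points for _, points in fired), [kind for kind, _ in fired]


# A made-up event as it might come off a normalized event stream.
event = {
    "attempts_last_hour": 12,
    "country": "RO",
    "home_country": "US",
    "amount": 30.0,
    "typical_amount": 45.0,
}
print(scorecard(event))
```

Keeping the rules in a plain list means the scoring and the classification
come from the same table, so adding a new kind of fraud is a one-line change.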

</description>
			<category>News - Latest</category>
			<pubDate>Tue, 27 May 2008 01:00:00 +0100</pubDate>
		</item>
		<item>
			<title>MySQL MyISAM versus InnoDB usage</title>
			<link>http://tomtuduc.info/content/view/67/</link>
			<description>
MySQL MyISAM versus InnoDB usage

MyISAM advantages:
1. High-speed logging (mod_log_sql for Apache): thousands of record
inserts per second.
2. MyISAM MERGE tables for analysis across similar tables, ideal for
statistical analysis of logs.
3. Listings, e.g. real estate websites, job websites, social networking
listings, product listings, comparison shopping listings, stock
listings.
4. MyISAM and compressed (read-only) MyISAM tables take much less
space, perfect for a read-only DVD.
5. MyISAM supports full-text search.
6. Indexing is much faster.

InnoDB advantages:
1. Any kind of monetary or account transaction where balances are
involved.
2. Referential integrity through foreign keys. For example, an orders
table with a foreign key &quot;customerID&quot; declared ON DELETE CASCADE removes a
customer's orders when that customer's record is deleted.
3. Heavy concurrent reads and writes, e.g. stock quotes.
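
The cascading delete described in item 2 can be sketched as follows. To
keep the example runnable without a MySQL server, it uses Python's
built-in sqlite3 module as a stand-in; the table and column names are
assumptions, and in this sketch the foreign key lives on the orders table
and points back at the customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE customer (customerID INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " orderID INTEGER PRIMARY KEY,"
    " customerID INTEGER NOT NULL"
    " REFERENCES customer(customerID) ON DELETE CASCADE,"
    " total REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob')")
conn.execute("INSERT INTO orders VALUES (10, 1, 9.5), (11, 1, 20.0), (12, 2, 5.0)")

# Deleting the customer removes that customer's orders as well.
conn.execute("DELETE FROM customer WHERE customerID = 1")
remaining = conn.execute("SELECT orderID FROM orders ORDER BY orderID").fetchall()
print(remaining)
```

In MySQL the same REFERENCES ... ON DELETE CASCADE clause is honored by
InnoDB tables, while MyISAM tables do not enforce it.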

Table conversion gotcha:
When MyISAM tables are converted to InnoDB, make sure every &quot;SELECT
COUNT(*) FROM aTable ...&quot; is addressed (or removed). MyISAM keeps an
exact row count, so an unfiltered COUNT(*) is answered immediately,
whereas InnoDB has to scan the rows.
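
One way such a count could be addressed is a small counter table kept up
to date by triggers, so the application reads one row instead of scanning
the table. The sketch below again uses Python's sqlite3 module as a
stand-in for MySQL; the row_counts table and the trigger names are
hypothetical, and MySQL's trigger syntax differs slightly from what is
shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE aTable (id INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE row_counts (table_name TEXT PRIMARY KEY, n INTEGER NOT NULL);
    INSERT INTO row_counts VALUES ('aTable', 0);

    CREATE TRIGGER aTable_ins AFTER INSERT ON aTable
    BEGIN UPDATE row_counts SET n = n + 1 WHERE table_name = 'aTable'; END;

    CREATE TRIGGER aTable_del AFTER DELETE ON aTable
    BEGIN UPDATE row_counts SET n = n - 1 WHERE table_name = 'aTable'; END;
    """
)
conn.executemany("INSERT INTO aTable (payload) VALUES (?)", [("a",), ("b",), ("c",)])
conn.execute("DELETE FROM aTable WHERE payload = 'b'")

# One-row lookup instead of scanning aTable.
(count,) = conn.execute("SELECT n FROM row_counts WHERE table_name = 'aTable'").fetchone()
print(count)
```

The trade-off is extra write work on every insert and delete, which only
pays off when the count is read often.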


For scheduled aggregation of data from many different sources that
requires wholesale cleansing of hundreds of millions of records, use
MEMORY tables for roughly two orders of magnitude faster performance.


</description>
			<category>News - Latest</category>
			<pubDate>Wed, 01 Oct 2008 01:00:00 +0100</pubDate>
		</item>
		<item>
			<title>SAS SQL Tutorials</title>
			<link>http://tomtuduc.info/content/view/66/</link>
			<description>SAS Table Joins Tutorial (http://www.scribd.com/doc/6444526/SAS-Table-Joins-Tutorial)</description>
			<category>News - Latest</category>
			<pubDate>Thu, 14 Aug 2008 01:00:00 +0100</pubDate>
		</item>
		<item>
			<title>Comparing Analytica and DPL for financial modeling</title>
			<link>http://tomtuduc.info/content/view/65/</link>
			<description>Comparing Analytica and DPL for Financial Modeling (http://www.scribd.com/doc/6213934/Comparing-Analytica-and-DPL-for-Financial-Modeling)</description>
			<category>News - Latest</category>
			<pubDate>Tue, 02 Sep 2008 01:00:00 +0100</pubDate>
		</item>
		<item>
			<title>How many Facebook users uploaded their pictures?</title>
			<link>http://tomtuduc.info/content/view/64/</link>
			<description>Question: How many 
Facebook users have uploaded a picture?
Answer: Somewhat less than half, as of September 18, 2008.
Methodology:
A pivot by locale shows more users without a picture than with one
in every locale (0 means no picture, 1 means has a picture).
Sample: 6508 users from the Facebook database table USER
Code:
$FQL2=&quot;SELECT first_name, last_name, pic_square, locale FROM user WHERE uid=$xxx&quot;;
$resultset2 = $facebook-&gt;api_client-&gt;fql_query($FQL2);

User IDs are created by a hash function in
Facebook. The uid $xxx in the sample set ranges from 1050170000 to
1450180000.
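
The pivot itself is not shown above, so here is a rough Python sketch of
how the counts per locale could be tallied once the FQL results are
collected. The rows are invented sample data, not the 6508 users from the
study, and it assumes a user without a picture comes back with an empty
pic_square.

```python
from collections import Counter

# Hypothetical rows shaped like the FQL result: (locale, pic_square or None).
rows = [
    ("en_US", "http://example.invalid/a.jpg"),
    ("en_US", None),
    ("en_US", None),
    ("fr_FR", None),
    ("fr_FR", "http://example.invalid/b.jpg"),
    ("fr_FR", None),
    ("es_LA", None),
]

# Key each row by (locale, has_pic) where has_pic is 1 or 0, then count.
pivot = Counter((locale, int(bool(pic))) for locale, pic in rows)

for locale in sorted({loc for loc, _ in pivot}):
    print(locale, "no pic:", pivot[(locale, 0)], "has pic:", pivot[(locale, 1)])
```

A Counter returns 0 for combinations that never occur, so a locale where
nobody has a picture still prints a complete row.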

</description>
			<category>News - Latest</category>
			<pubDate>Fri, 01 Aug 2008 01:00:00 +0100</pubDate>
		</item>
	</channel>
</rss>
