Diego Valle-Jones's Bloghttps://blog.diegovalle.net/2014-11-10T00:00:00+01:00HoyodeCrimen.com - Crime information for the Distrito Federal2014-11-10T00:00:00+01:00Diego Valle-Jonestag:blog.diegovalle.net,2014-11-10:2014/11/hoyodecrimencom-crime-information-for.html<div class="post-body entry-content" itemprop="articleBody">
Crime information for the Federal District now has its own website with updated data<br/>
<br/>
<a href="https://hoyodecrimen.com/en">https://hoyodecrimen.com/en</a> - English version<br/>
<a href="https://hoyodecrimen.com/">https://hoyodecrimen.com/</a> - Spanish version<br/>
<br/>
I’ve also added a <a href="https://hoyodecrimen.com/en/trends">trends</a> section where you can look up which cuadrantes experienced a rise in crime<br/>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="/images/blogger_images/3.bp.blogspot.com_-_wyTP_VtNUY_VGC4js4s5sI_AAAAAAAAIW0_IbXYaZdg2Hk_s1600_https%2B%2B%2Bhoyodecrimen.com%2Ben%2Btrends.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="236" src="/images/blogger_images/3.bp.blogspot.com_-_wyTP_VtNUY_VGC4js4s5sI_AAAAAAAAIW0_IbXYaZdg2Hk_s1600_https%2B%2B%2Bhoyodecrimen.com%2Ben%2Btrends.png" width="400"/></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">There seems to be a big problem with car robberies near where the new airport will be built</td></tr>
</tbody></table>
<br/>
<br/>
Since the crime data comes from <span class="caps">FOIA</span> requests to the <span class="caps">SSPDF</span> (Mexico City Police), I’ve added free email announcements to keep you informed of when new data is available:<br/>
<br/>
<a href="http://eepurl.com/71l2n">Notifications in English</a><br/>
<a href="http://eepurl.com/7XKNT">Notifications in Spanish</a><br/>
<br/>
<br/>
There’s even an <a href="https://hoyodecrimen.com/api/"><span class="caps">API</span></a> with lots of cool stuff:<br/>
<a name="more"></a><br/>
<table border="1" class="docutils" style="background-color: white; border-collapse: collapse; border: 0px; color: #3e4349; font-family: Arial, sans-serif; font-size: 14px;"><thead valign="bottom">
<tr class="row-odd"><th class="head" style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Service</th><th class="head" style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Action</th><th class="head" style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;"><span class="caps">URI</span></th></tr>
</thead><tbody valign="top">
<tr class="row-even"><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Point in Polygon</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Given a longitude and latitude return the corresponding cuadrante and sector</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;"><div class="first last line-block" style="margin-bottom: 1em; margin-top: 1em;">
<div class="line">
<br/></div>
</div>
</td></tr>
<tr class="row-odd"><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Time Series</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Crime counts ordered by month of occurrence for a cuadrante or sector</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;"><div class="first last line-block" style="margin-bottom: 1em; margin-top: 1em;">
<div class="line">
<br/></div>
</div>
</td></tr>
<tr class="row-even"><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">List Cuadrantes or Sectores</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Sum of crimes that occurred in a cuadrante or sector for a specified period of time</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;"><div class="first last line-block" style="margin-bottom: 1em; margin-top: 1em;">
<div class="line">
<br/></div>
</div>
</td></tr>
<tr class="row-odd"><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Top Most Violent</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">A list of the cuadrantes and sectors with the highest rates (sectores), crime counts (cuadrantes) or change in crime counts (cuadrantes)</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;"><div class="first last line-block" style="margin-bottom: 1em; margin-top: 1em;">
<div class="line">
<br/></div>
</div>
</td></tr>
<tr class="row-even"><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;"><span class="caps">DF</span> data</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">A time series of the sum of all crimes that occurred in the Federal District</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;"><div class="first last line-block" style="margin-bottom: 1em; margin-top: 1em;">
<div class="line">
<br/></div>
</div>
</td></tr>
<tr class="row-odd"><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Enumerate</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;">Get a list of the names of all cuadrantes, sectores or crimes</td><td style="border-bottom-color: rgb(170, 170, 170); border-bottom-style: solid; border-width: 0px 0px 1px; padding: 1px 8px 1px 5px;"><div class="first last line-block" style="margin-bottom: 1em; margin-top: 1em;">
<div class="line">
<br/></div>
</div>
</td></tr>
</tbody></table>
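To give an idea of what the point-in-polygon service does, here is a minimal sketch in plain Python using ray casting. It is only an illustration: the real API runs on PostGIS, and the cuadrante names and rectangular boundaries below are made up for the example.<br/>

```python
def point_in_polygon(lon, lat, polygon):
    """Return True if (lon, lat) is inside polygon, a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge reaches the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical cuadrantes (invented rectangles, not real police boundaries)
CUADRANTES = {
    "EXAMPLE-1": [(-99.20, 19.40), (-99.15, 19.40), (-99.15, 19.45), (-99.20, 19.45)],
    "EXAMPLE-2": [(-99.15, 19.40), (-99.10, 19.40), (-99.10, 19.45), (-99.15, 19.45)],
}


def find_cuadrante(lon, lat):
    """Return the name of the first cuadrante containing the point, or None."""
    for name, polygon in CUADRANTES.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


print(find_cuadrante(-99.13, 19.43))
```

A spatial database with an index over the polygons would normally do this lookup far faster than looping over every polygon in Python.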
<br/>
<span class="caps">P.S.</span> Yes, I know the Spanish translation is not complete, but I’ll finish it someday<br/>
<span class="caps">P.P.S.</span> The code is available at <a href="https://github.com/diegovalle/hoyodecrimen.api">GitHub</a>. It was built with Python + Flask + PostgreSQL + PostGIS + Redis + D3 and an unholy combination of jQuery and AngularJS
<div style="clear: both;"></div>
</div>Analysis of the UNAM’s entrance exam2014-04-07T00:00:00+02:00Diego Valle-Jonestag:blog.diegovalle.net,2014-04-07:2014/04/analisis-exam-de-admision-unam.html<div class="post-body entry-content" itemprop="articleBody">
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/1.bp.blogspot.com_-W024CTuRmSA_UzOZeyqzFpI_AAAAAAAAGlY_sHWd5TcNwow_s1600_Physical+Sciences%252C+Mathematics+and+Engineering-majors.svg.png" imageanchor="1" ><img border="0" height="300" src="/images/blogger_images/1.bp.blogspot.com_-W024CTuRmSA_UzOZeyqzFpI_AAAAAAAAGlY_sHWd5TcNwow_s1600_Physical+Sciences%252C+Mathematics+and+Engineering-majors.svg.png" width="540"/></a></div>
The <span class="caps">UNAM</span> is Mexico’s biggest and most important university. To enter it students must either take an exam or graduate from a high school run by the <span class="caps">UNAM</span> in less than 4 years with a grade point average of at least 70% (although some majors like medicine require 90% for <i>pase directo</i>). The admission exam is given twice a year, in February and June, and any student from any high school with at least a grade point average of 70% can take it. If the student meets the requirements for entering the <span class="caps">UNAM</span>, passing the exam guarantees him admission. The exam has 120 questions.<br/>
<a name="more"></a><br/>
Depending on the student’s choice of major, the admission exam emphasizes one of four basic areas of study. Let’s say you want to study math: in addition to being tested on the topics every high school student is supposed to know, you’ll get a couple of extra questions about integration by parts; if you want to study biology the extra questions will be about the Krebs cycle; if you want to study philosophy the exam will probably include extra references to the great works of literature; and if you want to study a social science you’ll be asked about the differences between a <a href="http://eldeforma.com/2014/03/22/unam-ofrecera-la-carrera-de-licenciatura-en-filantropia/">grand macchiato and a caffè latte</a>.<br/>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.diegovalle.net/charts/unam/all-unam.html"><img border="0" height="640" src="/images/blogger_images/1.bp.blogspot.com_-qb0xz0yZ8A8_UzORAIretuI_AAAAAAAAGk4_cP_dRh5EHxA_s1600_SankeyID612f45f26c41.png" style="margin-left: auto; margin-right: auto;" width="392"/></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Click on the chart to visit the interactive version</td></tr>
</tbody></table>
Apart from <span class="caps">UNAM</span>’s main (and most prestigious) campus of Ciudad Universitaria (<span class="caps">CU</span>), the university has several satellite campuses in the Mexico City metro area as well as many others across Mexico. In this post I analyse only the campuses located in Mexico City, and only the classroom-based system (<i>sistema escolarizado</i>), where students have to actually sit in a classroom (the <span class="caps">UNAM</span> also offers remote classes over <span class="caps">TV</span>/Internet and an open system). The admission exam to the <span class="caps">UNAM</span> is quite competitive and only a small percentage of those who apply actually get in.<br/>
<br/>
<table><thead>
<tr> <th>Date</th> <th>Location</th> <th>Percentage <br/>
admitted</th> <th>Applied</th> <th>Completed<br/>
Test </th> <th>Admitted</th> </tr>
</thead> <tbody>
<tr> <td>2011-06 </td> <td><span class="caps">CU</span> </td> <td>8.6</td> <td>30,615 </td> <td>27,515 </td> <td>2,626 </td> </tr>
<tr> <td>2011-06 </td> <td>Not <span class="caps">CU</span> </td> <td>9.1</td> <td>32,215 </td> <td>29,667 </td> <td>2,916 </td> </tr>
<tr> <td>2012-02 </td> <td><span class="caps">CU</span> </td> <td>5.2</td> <td>61,262 </td> <td>55,793 </td> <td>3,192 </td> </tr>
<tr> <td>2012-02 </td> <td>Not <span class="caps">CU</span> </td> <td>6.4</td> <td>56,012 </td> <td>52,287 </td> <td>3,597 </td> </tr>
<tr> <td>2012-06 </td> <td><span class="caps">CU</span> </td> <td>5.1 </td> <td>30,944 </td> <td>28,084 </td> <td>1,573 </td> </tr>
<tr> <td>2012-06 </td> <td>Not <span class="caps">CU</span> </td> <td>10.1</td> <td>32,741 </td> <td>30,354 </td> <td>3,299 </td> </tr>
<tr> <td>2013-02 </td> <td><span class="caps">CU</span> </td> <td>5.4</td> <td>63,562 </td> <td>59,425 </td> <td>3,424 </td> </tr>
<tr> <td>2013-02 </td> <td>Not <span class="caps">CU</span> </td> <td>6.7</td> <td>56,348 </td> <td>53,547 </td> <td>3,800 </td> </tr>
<tr> <td>2013-06 </td> <td><span class="caps">CU</span> </td> <td>6.3 </td> <td>29,872 </td> <td>26,052 </td> <td>1,868 </td> </tr>
<tr> <td>2013-06 </td> <td>Not <span class="caps">CU</span> </td> <td>13.8 </td> <td>33,744 </td> <td>30,403 </td> <td>4,656 </td> </tr>
</tbody> </table>
<br/>
This year Harvard sent out 2,023 offers out of 34,295 applications for an admission rate of 5.9%, which to one decimal place is the same overall admission rate as the <span class="caps">UNAM</span> (<span class="caps">CU</span>) from June 2011 to June 2013 (though, of course, Harvard has much, much <a href="http://www.eluniversal.com.mx/notas/860687.html">lower standards</a> in who it <a href="http://www.hks.harvard.edu/news-events/news/press-releases/felipe-calderon-appointment">admits</a>).<br/>
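The comparison can be checked with a few lines of Python. This sketch assumes the overall rate is total admitted divided by total applied (which is how the per-exam percentages in the table appear to be computed) and uses only the <span class="caps">CU</span> rows of the table plus the Harvard figures quoted above.<br/>

```python
# (applied, admitted) for the CU campus, copied from the table
cu_exams = {
    "2011-06": (30615, 2626),
    "2012-02": (61262, 3192),
    "2012-06": (30944, 1573),
    "2013-02": (63562, 3424),
    "2013-06": (29872, 1868),
}

total_applied = sum(applied for applied, _ in cu_exams.values())
total_admitted = sum(admitted for _, admitted in cu_exams.values())

# Overall admission rate across the five exams, in percent
cu_rate = 100 * total_admitted / total_applied
harvard_rate = 100 * 2023 / 34295

print(f"UNAM (CU): {cu_rate:.1f}%")   # UNAM (CU): 5.9%
print(f"Harvard:   {harvard_rate:.1f}%")  # Harvard:   5.9%
```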
<br/>
Admissions by major<br/>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.diegovalle.net/charts/unam/major-major.html" imageanchor="1" style="margin-left: auto; margin-right: auto; text-align: center;"><img border="0" height="640" src="/images/blogger_images/1.bp.blogspot.com_-gpIWyXhnrwY_UzOhxbR4AyI_AAAAAAAAGl0_CZPGRcMl4Y0_s1600_MEDICO+CIRUJANO+CU+31+219+persons.png" width="544"/></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Click on the chart to visit the interactive version</td></tr>
</tbody></table>
<br/>
Admissions by area<br/>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.diegovalle.net/charts/unam/area-area.html"><img border="0" height="279" src="/images/blogger_images/2.bp.blogspot.com_-bwjiCvN-jC4_UzOm2SNv8YI_AAAAAAAAGmE_T-sS6YTeDPw_s1600_Sankeyarea.png" style="margin-left: auto; margin-right: auto;" width="320"/></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Click on the chart to visit the interactive version</td></tr>
</tbody></table>
<div>
I scraped all admission results from the <a href="https://www.dgae.unam.mx/noticias/primingr/primingr.html"><span class="caps">UNAM</span>’s website</a> from June 2011 to June 2013. There were some problems with the data at the <span class="caps">UNAM</span>’s end, since the listings didn’t always match the summary statistics included in the web pages. If you visit the results for <a href="https://servicios.dgae.unam.mx/Junio2012/resultados/4/4337005.html">Historia del Arte</a> you’ll see that the summary statistics claim that zero students applied to take the test, but the listing includes 3 students (given the <a href="http://www.diegovalle.net/charts/unam/scores.html">test scores of informatics students</a> I’m not surprised by the mistakes). Anyway, this kind of mismatch was rare and didn’t involve that many students, so I took the actual listing to be definitive. In addition, the results sometimes include students whose result is <i>Cita para aclarar situación escolar</i>; I simply treated these as <a href="http://www.statmethods.net/input/missingdata.html">missing values</a>.</div>
<div>
<br/></div>
<div>
Physical Sciences, Mathematics and Engineering</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/3.bp.blogspot.com_-OR_nW6bpjw4_U0HS18xq6pI_AAAAAAAAGmc_pnGpJw045Mc_s1600_Physical+Sciences,+Mathematics+and+Engineering-majors.svg.png" imageanchor="1" ><img border="0" height="182" src="/images/blogger_images/3.bp.blogspot.com_-OR_nW6bpjw4_U0HS18xq6pI_AAAAAAAAGmc_pnGpJw045Mc_s1600_Physical+Sciences,+Mathematics+and+Engineering-majors.svg.png" width="320"/></a></div>
<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/3.bp.blogspot.com_-HS67EjH3MBw_U0HS16RWYhI_AAAAAAAAGmg_fJTIy2vhs64_s1600_Physical+Sciences%252C+Mathematics+and+Engineering-faculty.svg.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/3.bp.blogspot.com_-HS67EjH3MBw_U0HS16RWYhI_AAAAAAAAGmg_fJTIy2vhs64_s1600_Physical+Sciences%252C+Mathematics+and+Engineering-faculty.svg.png" width="320"/></a></div>
<br/>
Biological Sciences and Health<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com_-6a24UiCR57k_U0HS-DFMkRI_AAAAAAAAGnA_ZtZWTLa7N8Y_s1600_Biological+Sciences+and+Health-majors.svg.png" imageanchor="1" ><img border="0" height="182" src="/images/blogger_images/4.bp.blogspot.com_-6a24UiCR57k_U0HS-DFMkRI_AAAAAAAAGnA_ZtZWTLa7N8Y_s1600_Biological+Sciences+and+Health-majors.svg.png" width="320"/></a></div>
<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com_-UQP1ST7BeKM_U0HS-Pa9OmI_AAAAAAAAGm8_2WykQkhbB4Q_s1600_Biological+Sciences+and+Health-faculty.svg.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/4.bp.blogspot.com_-UQP1ST7BeKM_U0HS-Pa9OmI_AAAAAAAAGm8_2WykQkhbB4Q_s1600_Biological+Sciences+and+Health-faculty.svg.png" width="320"/></a></div>
<br/>
Social Sciences<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com_-zCkSPBDiIVA_U0HS-178WzI_AAAAAAAAGnQ_73Fnf2TrHV0_s1600_Social+Sciences-majors.svg.png" imageanchor="1" ><img border="0" height="182" src="/images/blogger_images/4.bp.blogspot.com_-zCkSPBDiIVA_U0HS-178WzI_AAAAAAAAGnQ_73Fnf2TrHV0_s1600_Social+Sciences-majors.svg.png" width="320"/></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com_-XqDGiuVZi_Y_U0HS-pvUn5I_AAAAAAAAGnI_pdnibLdci9w_s1600_Social+Sciences-faculty.svg.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/4.bp.blogspot.com_-XqDGiuVZi_Y_U0HS-pvUn5I_AAAAAAAAGnI_pdnibLdci9w_s1600_Social+Sciences-faculty.svg.png" width="320"/></a></div>
<br/>
<br/>
Humanities and Arts<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/1.bp.blogspot.com_-NocvMsnOvPw_U0HS-vaJRkI_AAAAAAAAGnU_5M0TARxL9no_s1600_Humanities+and+Arts-majors.svg.png" imageanchor="1" ><img border="0" height="182" src="/images/blogger_images/1.bp.blogspot.com_-NocvMsnOvPw_U0HS-vaJRkI_AAAAAAAAGnU_5M0TARxL9no_s1600_Humanities+and+Arts-majors.svg.png" width="320"/></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com_-RqKYkhbGyI8_U0HS95_XDaI_AAAAAAAAGm0_LdRWvFsCyls_s1600_Humanities+and+Arts-faculty.svg.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/4.bp.blogspot.com_-RqKYkhbGyI8_U0HS95_XDaI_AAAAAAAAGm0_LdRWvFsCyls_s1600_Humanities+and+Arts-faculty.svg.png" width="320"/></a></div>
<br/>
Highest scoring majors in each area<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/3.bp.blogspot.com_-jK_laWGXNCs_U0HdbjuBuQI_AAAAAAAAGoQ__u6vrA_ajyU_s1600_top-majors.svg.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/3.bp.blogspot.com_-jK_laWGXNCs_U0HdbjuBuQI_AAAAAAAAGoQ__u6vrA_ajyU_s1600_top-majors.svg.png" width="320"/></a></div>
<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/3.bp.blogspot.com_-svPjq-rOAxU_U0HjnyO4kwI_AAAAAAAAGog_UqVXyPj6z3w_s1600_percent-admitted-top.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/3.bp.blogspot.com_-svPjq-rOAxU_U0HjnyO4kwI_AAAAAAAAGog_UqVXyPj6z3w_s1600_percent-admitted-top.png" width="320"/></a></div>
<br/>
Changes in median admission scores from exam to exam (first differences)<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/1.bp.blogspot.com_-7z8jZsCxVmQ_U0HUF0a-UqI_AAAAAAAAGnc_uRsbct3d56U_s1600_change-in-trend.svg.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/1.bp.blogspot.com_-7z8jZsCxVmQ_U0HUF0a-UqI_AAAAAAAAGnc_uRsbct3d56U_s1600_change-in-trend.svg.png" width="320"/></a></div>
Median scores each year<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/1.bp.blogspot.com_-3YgR1uu5X2Q_U0HU4hV2hLI_AAAAAAAAGnk_6RYzSWUSrfQ_s1600_median-admit.svg.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/1.bp.blogspot.com_-3YgR1uu5X2Q_U0HU4hV2hLI_AAAAAAAAGnk_6RYzSWUSrfQ_s1600_median-admit.svg.png" width="320"/></a></div>
<br/>
(a standard deviation is 17 points)<br/>
<br/>
One big problem with the admission process at the <span class="caps">UNAM</span> is that it doesn’t use <a href="http://en.wikipedia.org/wiki/Stable_marriage_problem">Gale-Shapley</a> and thus students have an incentive to not reveal their true preferences:<br/>
<br/>
Imagine a student who is deciding between studying <i>ingeniería mecatrónica</i> and <i>ingeniería mecánica</i>. Because mechatronics is very hard to get into, and students only get to apply to one major at each exam, he decides to list mechanical engineering as his choice even though there was a chance (albeit not as big) that he would have been admitted to mechatronics.<br/>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.diegovalle.net/charts/unam/wasted.html" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="213" src="/images/blogger_images/1.bp.blogspot.com_-nzYPzDdZQGQ_U0HdAmOf-zI_AAAAAAAAGoI_ikXUBkJOo6s_s1600_mecatronica.svg.png" width="320"/></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Click on the chart to visit the interactive version</td></tr>
</tbody></table>
<br/>
Another consequence of not using Gale-Shapley is that <b>some students who do list their true preference are rejected in favor of lower scoring students</b> (this is probably a really big deal).<br/>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td><a href="http://www.diegovalle.net/charts/unam/wasted.html" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="213" src="/images/blogger_images/3.bp.blogspot.com_-eh_wRlVfro0_U0Hc19TdSuI_AAAAAAAAGoA_gC8C0dI6sD4_s1600_rejected-area.svg.png" width="320"/></a></td></tr>
<tr><td class="tr-caption" style="font-size: 13px;">Click on the chart to visit the interactive version<br/>
<div>
<br/></div>
</td></tr>
</tbody></table>
Imagine a student who lists his or her true preference but who would be willing to study at an alternative campus with a lower admission score (international relations at <span class="caps">CU</span> vs international relations at <span class="caps">FES</span> Aragón) or a similar major with lower admission requirements (international relations vs political science, for example). If that student is rejected from the first choice, he or she is left without a place, even with a test score higher than the minimum required for the alternative.<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/3.bp.blogspot.com_-AzHg0DKTJZM_U0HaOjR06NI_AAAAAAAAGn0_tU1ZxSjTqo4_s1600_percent-admit.svg.png" imageanchor="1" ><img border="0" height="200" src="/images/blogger_images/3.bp.blogspot.com_-AzHg0DKTJZM_U0HaOjR06NI_AAAAAAAAGn0_tU1ZxSjTqo4_s1600_percent-admit.svg.png" width="320"/></a></div>
We can see in the above chart how the percentage of students admitted increases at the lower requirement campuses during the June exam, but there is no such increase at the main <span class="caps">CU</span> campus. This is probably a consequence of students who had been rejected after the February exam having another go at being admitted at the lower requirement satellite campuses.<br/>
<br/>
The easiest way to remedy this would be to copy the <a href="http://blog.diegovalle.net/2013/07/the-best-high-schools-in-mexico-city.html"><span class="caps">COMIPEMS</span></a> exam —which the <span class="caps">UNAM</span> helped design— and allow students to list more than one major/campus when applying. Obviously there are a lot of variations and complications the <span class="caps">UNAM</span> could use to make its admission process better, for example they could start admitting students from different areas than their first choice major if they had really high test scores. I’m sure the <strike>applied math and computing</strike> international relations students are up to the task of designing an admittance mechanism that satisfies Gale-Shapley.<br/>
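To make the idea concrete, here is a minimal sketch of student-proposing deferred acceptance (the Gale-Shapley mechanism) in Python. It assumes every program simply ranks applicants by exam score and that scores are distinct; the students, scores and seat counts are invented for the example and have nothing to do with the real data.<br/>

```python
def deferred_acceptance(scores, preferences, capacity):
    """Student-proposing deferred acceptance with programs ranking by score.

    scores:      {student: exam score} (assumed distinct)
    preferences: {student: [program, ...]}, most preferred first
    capacity:    {program: number of seats}
    Returns {student: program or None}.
    """
    next_choice = {s: 0 for s in scores}   # index of the next program to try
    held = {p: [] for p in capacity}       # students tentatively accepted
    free = list(scores)
    while free:
        student = free.pop()
        prefs = preferences[student]
        if next_choice[student] >= len(prefs):
            continue  # list exhausted: the student stays unassigned
        program = prefs[next_choice[student]]
        next_choice[student] += 1
        held[program].append(student)
        held[program].sort(key=lambda s: scores[s], reverse=True)
        if len(held[program]) > capacity[program]:
            # Over capacity: the lowest scorer is bumped and tries again
            free.append(held[program].pop())
    result = {s: None for s in scores}
    for program, students in held.items():
        for s in students:
            result[s] = program
    return result


# Hypothetical applicants: two want mechatronics first, one only lists mechanics
scores = {"ana": 110, "beto": 100, "carla": 90}
preferences = {
    "ana": ["mecatronica", "mecanica"],
    "beto": ["mecatronica", "mecanica"],
    "carla": ["mecanica"],
}
capacity = {"mecatronica": 1, "mecanica": 1}

print(deferred_acceptance(scores, preferences, capacity))
```

In this toy example beto can safely list mechatronics first: when he loses that seat to ana he still gets the mechanics seat ahead of the lower-scoring carla, which is exactly what the one-choice system fails to guarantee.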
<br/>
There’s also a positive correlation between average test scores and the starting salaries of university graduates (of any university in Mexico City, not just the <span class="caps">UNAM</span>) in a poll conducted by Reforma in Mexico City.<br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/2.bp.blogspot.com_-gSA--D5fK5g_U0H_MmUgySI_AAAAAAAAGo4_SUCFV_z7LEs_s1600_score_vs_salary.png" imageanchor="1" ><img border="0" height="213" src="/images/blogger_images/2.bp.blogspot.com_-gSA--D5fK5g_U0H_MmUgySI_AAAAAAAAGo4_SUCFV_z7LEs_s1600_score_vs_salary.png" width="320"/></a></div>
<br/>
<br/>
<span class="caps">P.S.</span> I bet the <a href="http://www.diegovalle.net/charts/unam/treemap-unam.html">Escuela Nacional de Artes Plásticas</a> is full of <a href="http://www.npr.org/blogs/money/2014/03/18/289013884/who-had-richer-parents-doctors-or-arists">rich kids</a><br/>
<span class="caps">P.P.S.</span> Visit the <a href="http://www.diegovalle.net/charts/unam/scores.html">companion website</a> full of interactive charts<br/>
<span class="caps">P.P.P.S.</span> <a href="https://github.com/diegovalle/unam">Source code</a><br/>
<div style="clear: both;"></div>
</div>State population datasets 1990-20302013-08-28T00:00:00+02:00Diego Valle-Jonestag:blog.diegovalle.net,2013-08-28:2013/08/state-population-datasets-1990-2030.html<div class="post-body entry-content" itemprop="articleBody">
<div>
<br/>
Cleaned up Mexican <a href="https://github.com/diegovalle/conapo-2010">population estimates</a> by five-year age groups and gender 1990-2030 (mid-year) at the state level:</div>
<div>
<br/>
<ul>
<li><a href="https://github.com/diegovalle/conapo-2010/blob/master/clean-data/state-population.csv">Total population by sex</a></li>
<li><a href="https://github.com/diegovalle/conapo-2010/blob/master/clean-data/state-population-age-groups.csv">Total population by sex and five-year age groups</a></li>
</ul>
</div>
<br/>
Original data from the <a href="http://www.conapo.gob.mx/es/CONAPO/Proyecciones"><span class="caps">CONAPO</span> website</a>
<div style="clear: both;"></div>
</div>Strengths and weaknesses of crime data in Mexico2011-02-10T00:00:00+01:00Diego Valle-Jonestag:blog.diegovalle.net,2011-02-10:2011/02/strengths-and-weaknesses-of-crime-data.html<div class="post-body entry-content" itemprop="articleBody">
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBuk-dICI_AAAAAAAAEo8_Qa0sxpXnwt8_s1600_inegi-vs-snsp-recent.png" imageanchor="1" ><img border="0" height="224" src="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBuk-dICI_AAAAAAAAEo8_Qa0sxpXnwt8_s320_inegi-vs-snsp-recent.png" width="320"/></a></div>
With so much data pertaining to the drug war released recently it’s hard to keep track of it all. And as with all things in life there are different pros and cons associated with each of the datasets: The homicide data from the police (<span class="caps">SNSP</span>), the homicide data from the vital statistics (<span class="caps">INEGI</span>), and the different estimates of drug war related deaths from Reforma, Milenio, and the database of homicides presumed to have been committed by organized crime.<br/>
<br/>
<a name="more"></a><h4>
Homicide data from the <span class="caps">SNSP</span></h4>
This data is based on police reports. The homicide rates calculated with it used to be higher until 2008 when they suddenly started being lower. <br/>
<div style="text-align: center;">
<a href="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBtw2PWQI_AAAAAAAAEo4_RMXaYsadivI_s1600_inegi-vs-snsp97-10.png" ><img border="0" height="228" src="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBtw2PWQI_AAAAAAAAEo4_RMXaYsadivI_s320_inegi-vs-snsp97-10.png" width="320"/></a> </div>
<b><i>Cons</i></b>
<br/>
<ul>
<li><b>The numbers in the database correspond to the
number of police reports (“averiguaciones previas”) for the crime of homicide, not to the number of
dead bodies</b>. The reports may contain more than one victim, and furthermore they may be repeated.</li>
<li>In 2008 the <span class="caps">SNSP</span> gave <a href="http://blog.diegovalle.net/2010/10/2009-homicide-data-for-chihuahua-has.html">incomplete data</a> to the <span class="caps">ICESI</span> for the state of Chihuahua, then when the data for <a href="http://blog.diegovalle.net/2010/07/mystery-solved-discrepancy-in-homicide.html">2009 was released</a>, they <a href="http://blog.diegovalle.net/2010/06/police-records-for-2009-are-out.html">updated the 2008 data</a> and again gave incomplete data for Chihuahua. It’s understandable that there would be some delay given the incredible rise in homicides, but it’s kind of fishy to forget to mention how incomplete it was. Sadly, the incomplete data was used for the homicide rates calculated by the <span class="caps">UN</span>. </li>
<li>
In the state of Tamaulipas, during August 2010, Mexican marines found
the dead bodies of 72 persons inside a ranch. The victims were
immigrants from Central and South America, presumably killed by the Zetas. Yet
the database records fewer than 70 homicides in all of Tamaulipas during the month of August.</li>
<li>The State of Mexico had 3 months without homicides at the end of 1998 </li>
<li>Starting January 2007 the number of homicides in the state of Mexico <a href="/images/blogger_images/i.imgur.com_IyRSe.png">dropped by half from one month to the next</a>.</li>
<li>In 1997, Yucatán, Aguascalientes, and Querétaro had incredibly high homicide rates; there’s probably an error in the database. </li>
<li>The number of homicides in <a href="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_S_8Hy_Wmt9I_AAAAAAAAEH8_puj4xbzvhYY_s1600_INEGI-SNSP-dif.png">Tlaxcala before 2007</a> seems way too high. </li>
<li>According to the database there were no homicides in Tlaxcala during
2007. However, the General Secretary of the state verbally reported to
the <span class="caps">ICESI</span> that there were 42 homicides in 2007. Also in Tlaxcala, during
2006, there was an anomalously high number of kidnappings, probably
the result of another error, unless the smallest state in Mexico
accounted for over 40% of the kidnappings in the entire country.
</li>
<li>There’s no data on homicides by firearm in the states of Baja California, Oaxaca, and Tabasco, and it fluctuates wildly in Guerrero and Jalisco.<br/><div style="text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_TVIBwZKl6sI_AAAAAAAAEpI_pkNnhbQvRME_s1600_sm-firearm-chihuahua.png" ><img border="0" height="248" src="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_TVIBwZKl6sI_AAAAAAAAEpI_pkNnhbQvRME_s320_sm-firearm-chihuahua.png" width="320"/></a></div>
</li>
<li>The proportion of homicides by firearm was incredibly low in Chihuahua in 2008. That was the year of the joint operation in Chihuahua and it <a href="http://blog.diegovalle.net/2010/09/how-expiration-of-assault-weapon-ban.html">doesn’t match</a> the data from the <span class="caps">INEGI</span>. Basically the firearm data from the <span class="caps">SNSP</span> is useless unless you make lots of adjustments to it.<br/><div style="text-align: center;">
<a href="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBql4anVI_AAAAAAAAEoo_uUOa8XVR37g_s1600_firearm-chihuahua.png" ><img border="0" height="228" src="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBql4anVI_AAAAAAAAEoo_uUOa8XVR37g_s320_firearm-chihuahua.png" width="320"/></a></div>
</li>
<li>It looks like the different states send a bunch of Excel files to the <span class="caps">SNSP</span>, which then tallies them. With such an outdated way of collecting data, it’s no surprise the database is plagued with mistakes.</li>
</ul>
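Several of the problems above (months with no homicides at all, counts that drop by half from one month to the next) are the kind of thing a simple scan of a monthly series can flag. Here is a minimal sketch in Python, assuming the counts for one state are already in a list ordered by month. The function name, the 50% threshold, and the sample numbers are all made up for illustration and are not real SNSP data:<br/>

```python
def flag_anomalies(monthly_counts, drop_ratio=0.5):
    """Return (index, reason) pairs for suspicious months in a series of counts.

    A month is flagged when its count is zero, or when it is at most
    `drop_ratio` times the count of the previous (non-zero) month.
    """
    flags = []
    for i, count in enumerate(monthly_counts):
        if count == 0:
            flags.append((i, "zero"))
        elif i > 0 and monthly_counts[i - 1] > 0 \
                and count <= monthly_counts[i - 1] * drop_ratio:
            flags.append((i, "sudden drop"))
    return flags


# Hypothetical monthly police-report counts for a single state
series = [210, 198, 205, 0, 0, 0, 201, 96, 101]
print(flag_anomalies(series))
# [(3, 'zero'), (4, 'zero'), (5, 'zero'), (7, 'sudden drop')]
```

A flag only says a month deserves a second look; a real drop in violence would trip the same rule, so each hit still has to be checked against another source.<br/>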
<i><b>Pros</b></i><br/>
<ul>
<li>It is constantly updated and the data is only a couple of months out of date. Although the <span class="caps">SNSP</span> hasn’t updated its <a href="http://www.secretariadoejecutivosnsp.gob.mx/es/SecretariadoEjecutivo/Incidencia_Delictiva_Nacional_fuero_comun">online download tool since September</a>, they have updated the PDFs with the crime data. <a href="http://scraperwiki.com/scrapers/homicides-in-mexico-1997-2008/">(Here’s a scraper</a> written in Python to extract homicide data from the PDFs)</li>
</ul>
<br/>
<h4>
Homicide data from the <span class="caps">INEGI</span></h4>
This data is based on death certificates compiled by the Mexican government.<br/>
<br/>
<i><b>Cons</b></i><br/>
<ul>
<li>The deaths of the Acteal Massacre in 1997 were <a href="http://blog.diegovalle.net/2010/12/some-problems-with-mexican-mortality.html">registered as accidents instead of homicides</a>, and certified by a forensic doctor (“médico legista”) in Tuxtla Gutiérrez to boot. However, I did check some of the recent massacres and they were all in the database.</li>
<li>It takes a while for all the death certificates to be tallied, so the data is usually more than a year out of date.</li>
<li>The cutoff date of December 31 means the homicides for the last available year are under-counted by 4% (25% for the last month of the year)</li>
<li>There’s a weird <a href="http://blog.diegovalle.net/2010/12/some-problems-with-mexican-mortality.html">pattern of lesions of undetermined intent</a> in Ciudad Juárez in 2007</li>
<li>In 2008 most newspapers reported the number of deaths in Ciudad Juárez as 1,650, while the number in the database is 1,610. The discrepancy is probably because newspapers tend to report the number of homicides not only in Juárez but also in the adjacent municipalities. In 2009 there were 2,316 homicides in the database (about 2,375 taking into account the undercount), but press reports placed the number of homicides close to 2,700.</li>
</ul>
<i><b>Pros</b></i><br/>
<ul>
<li>This is the best record of homicides in Mexico. If you download the mortality database from <span class="caps">SINAIS</span> you’ll get a daily record for all deaths in Mexico at the locality and municipality levels. You can also find out how many people <a href="http://blog.diegovalle.net/2010/09/how-expiration-of-assault-weapon-ban.html">died from firearms</a>, poisoning, knife wounds, etc.</li>
</ul>
<br/>
<h4>
Execution tallies by the newspapers<i> Milenio</i> and <i>Reforma </i></h4>
<i><b>Cons</b></i><br/>
<ul>
<li>Given the low prosecution rates, it is not surprising that the <i>Reforma</i> and <i>Milenio</i> series differ. However, the difference between the series should be random, and starting in June 2009 <i>Reforma</i> shows a precipitous drop only to go back up again. Given that <a href="http://blog.diegovalle.net/2010/07/mystery-solved-discrepancy-in-homicide.html">Chihuahua accounted for most of the drop</a>, it looks like <i>Reforma</i> missed some narco-executions.
<div style="text-align: center;">
<a href="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBvXwgN1I_AAAAAAAAEpE_4gfB4GIy6O8_s1600_reforma-vs-milenio.png" ><img border="0" height="228" src="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBvXwgN1I_AAAAAAAAEpE_4gfB4GIy6O8_s320_reforma-vs-milenio.png" width="320"/></a></div>
I am thankful for <i>Reforma</i>’s steadfast devotion to the task of tallying the homicides week by week. It matters, because it lets people know what is happening and keeps the government from hiding the magnitude of this tragedy. I’ve used <a href="http://blog.diegovalle.net/2010/06/statistical-analysis-and-visualization.html">their data before</a>, but something went wrong in the state of Chihuahua in 2009.</li>
<li><i>Milenio</i> went from counting more drug war related homicides than the government to fewer.</li>
</ul>
<div style="text-align: center;">
<a href="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBuyftfCI_AAAAAAAAEpA_ldhoone0kSI_s1600_milenio-vs-drh.png" ><img border="0" height="228" src="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBuyftfCI_AAAAAAAAEpA_ldhoone0kSI_s320_milenio-vs-drh.png" width="320"/></a> </div>
<i><b>Pros</b></i><br/>
<ul>
<li>Constantly updated</li>
<li>Data from <i>Reforma</i> is available by week and at the state level</li>
</ul>
<br/>
<h4>
Crimes presumed to be linked with organized crime (drug war related murders)</h4>
This data is based on police reports, but filtered to only include those deaths presumed to be linked with organized crime (drug cartels) and exclude duplicate reports.<br/>
<br/>
<i><b>Cons</b></i><br/>
<ul>
<li>The government missed the incredible rise in murders in Ciudad Juárez right before the army arrived. There’s also a discrepancy with the Baja California homicide data.</li>
</ul>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/1.bp.blogspot.com__q3Caf3YFFAs_TVITVOTOS8I_AAAAAAAAEpM_PGChpyOfpMU_s1600_juarez.png" imageanchor="1" ><img border="0" height="228" src="/images/blogger_images/1.bp.blogspot.com__q3Caf3YFFAs_TVITVOTOS8I_AAAAAAAAEpM_PGChpyOfpMU_s320_juarez.png" width="320"/></a></div>
<ul>
<li>The government’s definition of what constitutes an execution is at odds with the one used by the newspapers (all drug war related deaths).</li>
<li>Since the database contains the number of deaths and not police reports like the <span class="caps">SNSP</span> homicide data, we can compare it to the <span class="caps">INEGI</span>. A higher number of drug-related homicides than total homicides would indicate a serious problem with at least one of the databases:
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_TVK65KW6OZI_AAAAAAAAEpc_p5Fji8Jnefs_s1600_drh-vs-inegi-recent.png" imageanchor="1" ><img border="0" height="224" src="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_TVK65KW6OZI_AAAAAAAAEpc_p5Fji8Jnefs_s320_drh-vs-inegi-recent.png" width="320"/></a></div>
<br/>
There’s a big discrepancy in the state of Sinaloa. The differences in other states are small and may be due to the deaths being ordered by date of registration instead of occurrence. I also only compared by state, since it is unclear in the drug-related homicide database whether the municipality refers to the place where the murder took place or where it was registered.</li>
<li>If you add the values for all the aggressions, shootouts, and executions and compare them with the precomputed totals in the database, the values for April 2009 (Acaponeta) and October 2010 (Manzanillo) are off by one, which speaks volumes about the care with which it was compiled.<br/>
</li>
<li>We can also compare the drug-related database to the <span class="caps">SNSP</span> homicide data. As I suspected, the number of <a href="http://blog.diegovalle.net/2010/12/recent-developments-in-drug-war.html">homicides in Chihuahua and Baja California</a> was underreported: </li>
</ul>
<div style="text-align: center;">
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_TVK7AOMzbuI_AAAAAAAAEpg_9xgHjCD4WQo_s1600_drh-vs-snsp-recent.png" imageanchor="1" ><img border="0" height="224" src="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_TVK7AOMzbuI_AAAAAAAAEpg_9xgHjCD4WQo_s320_drh-vs-snsp-recent.png" width="320"/></a></div>
</div>
<br/>
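The comparisons above boil down to one rule: drug-related deaths are a subset of all homicides, so a state where the drug-related count is higher than the total points to a problem in at least one database. A minimal sketch of that check in Python, assuming both sources have already been reduced to a state-to-count mapping. The function name, state labels, and numbers are hypothetical:<br/>

```python
def states_exceeding_total(drug_related, total_homicides):
    """Return {state: excess} where drug-related deaths exceed total homicides.

    States missing from `total_homicides` are skipped rather than guessed at.
    """
    problems = {}
    for state, drug_count in drug_related.items():
        total = total_homicides.get(state)
        if total is not None and drug_count > total:
            problems[state] = drug_count - total
    return problems


# Hypothetical counts, not taken from either database
drug_related = {"State A": 120, "State B": 340, "State C": 15}
all_homicides = {"State A": 150, "State B": 310, "State C": 15}
print(states_exceeding_total(drug_related, all_homicides))
# {'State B': 30}
```

Small excesses may just reflect deaths being ordered by date of registration instead of occurrence, so in practice some tolerance would be needed before calling a state a discrepancy.<br/>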
<i><b>Pros</b></i><br/>
<div class="separator" style="clear: both; text-align: center;">
<a href="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_TVIYefSzRTI_AAAAAAAAEpQ_a7JG5BaOILU_s1600_drug-non.png" imageanchor="1" ><img border="0" height="177" src="/images/blogger_images/4.bp.blogspot.com__q3Caf3YFFAs_TVIYefSzRTI_AAAAAAAAEpQ_a7JG5BaOILU_s320_drug-non.png" width="320"/></a></div>
<ul>
<li>The database is divided into executions, shootouts, and aggressions against the government, although the definitions seem somewhat dubious: shootouts can start as aggressions, and any homicide by firearm may be counted as an execution.</li>
<li>Contains recent data </li>
</ul>
<br/>
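Because the database is split into three categories and also ships a precomputed total, the off-by-one errors mentioned in the cons can be found by re-adding the categories for every row. A minimal sketch in Python, assuming each row has been loaded as a dictionary. The key names and the sample rows are made up for illustration and do not reflect the real column names or values:<br/>

```python
def mismatched_totals(rows):
    """Return the rows whose three category counts do not add up to the stored total."""
    bad = []
    for row in rows:
        recomputed = row["executions"] + row["shootouts"] + row["aggressions"]
        if recomputed != row["total"]:
            bad.append(row)
    return bad


# Hypothetical rows: one per municipality and month
rows = [
    {"place": "Municipality A", "month": "2009-04",
     "executions": 3, "shootouts": 1, "aggressions": 0, "total": 5},
    {"place": "Municipality B", "month": "2009-04",
     "executions": 7, "shootouts": 2, "aggressions": 1, "total": 10},
]
for row in mismatched_totals(rows):
    print(row["place"], row["month"])
# Municipality A 2009-04
```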
Since I suspect that homicides by firearm account for a big chunk of drug-related homicides, I decided to compare the drug-related homicides in the most violent states (excluding Sinaloa because of the discrepancy) with the data from the <span class="caps">INEGI</span>:<br/>
<div style="text-align: center;">
<a href="/images/blogger_images/3.bp.blogspot.com__q3Caf3YFFAs_TVIBrfrRS8I_AAAAAAAAEos_QmEIp0vQItw_s1600_inegi.firearm-vs-drh.png" ><img border="0" height="240" src="/images/blogger_images/3.bp.blogspot.com__q3Caf3YFFAs_TVIBrfrRS8I_AAAAAAAAEos_QmEIp0vQItw_s320_inegi.firearm-vs-drh.png" width="320"/></a></div>
The relationships between firearm homicides and drug-related homicides look linear…<br/>
<div style="text-align: center;">
<a href="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBscKdi0I_AAAAAAAAEow_O-pfC_7Zw_M_s1600_inegi.firearm-vs-drh-correlation.png" ><img border="0" height="240" src="/images/blogger_images/2.bp.blogspot.com__q3Caf3YFFAs_TVIBscKdi0I_AAAAAAAAEow_O-pfC_7Zw_M_s320_inegi.firearm-vs-drh-correlation.png" width="320"/></a></div>
except for Baja California, but that will be the topic of my next post…<br/>
<br/>
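One way to put a number on "looks linear" is to fit a straight line to each state's pairs of firearm homicides and drug-related homicides and look at the correlation coefficient. This is only a sketch of ordinary least squares in plain Python, not the method used to draw the charts above, and the sample numbers are invented:<br/>

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    Assumes len(x) == len(y) >= 2 and that neither series is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical monthly counts for one state
firearm_homicides = [40, 55, 63, 80, 95, 120]
drug_related = [30, 47, 50, 70, 82, 110]
slope, intercept, r = linear_fit(firearm_homicides, drug_related)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.3f}")
```

A correlation close to 1 in one state and a much lower one in another would be a first hint that the two databases disagree there.<br/>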
<div style="clear: both;"></div>
</div>