Fabrizzio:
Here are some observations that we could use for the discussion part:
- We had to downscale the values of many variables, so many variables actually have only province-level resolution. For some districts where there was an important difference between the predicted and the calculated values, I saw that the only variable that changed significantly was the 'elec' variable. The 'health_dis' variable is also calculated at district-level resolution, but this variable is more 'zonal': neighboring districts tend to have similar values.
- Some variables have very weird values. For example, the 'elec' variable has districts with 0% electricity (very weird) or with values below 1%, also very weird!
- Most of the districts where there was an over- or underestimation were 'capital' districts. I mean not only the capital of the country but also the capital districts of their provinces. This is the case for the Krong Kracheh district, the capital of the Kracheh province: it is the one shown in red in Fig 1, very small and to the east.
- It is very likely that, since we downscaled from province level to district level, there is a tendency to have 'more errors' in the prediction for the smallest districts. This makes sense: if we take an average value over a territory of 100 km2 (let's say) that has a lot of rural areas and some cities, and assign it to a small district that is only 5 km2 and very urban (like the capitals), this average value will be quite different from the actual value in that small district. On the other hand, a bigger district will have a 'real' value that is likely closer to this average. So our downscaling likely gives wrong values in many cases, especially for the smaller districts.
- I don't understand at all why there is a negative correlation between neo_mort and ALL of the variables. This is very, very weird, because it means, for example, that when there is more domestic violence, there will be fewer cases of neonatal mortality. It also means that with more distance to the hospitals, there will be fewer cases of neonatal mortality. This is obviously nonsense, so I'd be tempted to say that, since our prediction was not really that good (R2 of 0.21) and since we didn't have the best data at our disposal, the linear model that we produced makes its predictions only out of 'luck' or 'randomness' and is not a real explanatory model. Or that, in any case, our results would need to be studied in more detail, with better data (with no errors and at the correct resolution) and with some other, better explanatory variables (for example an economic predictor, like average income per district), to validate or refute our results. (Or did I misunderstand the negative value of the coefficients??)
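
To make the point about small districts concrete, here is a tiny worked example. All the numbers are invented for illustration (they are not from our data), and I assume the province value is a simple area-weighted average, which may not be how the real province figures were computed.

```python
# Hypothetical province made of one large rural district and one small
# urban (capital) district. Areas and 'elec' values are made up.
districts = {
    "large_rural": {"area_km2": 95.0, "elec_pct": 20.0},
    "small_capital": {"area_km2": 5.0, "elec_pct": 90.0},
}

# Assumption: the province-level value is the area-weighted average.
total_area = sum(d["area_km2"] for d in districts.values())
province_avg = (
    sum(d["area_km2"] * d["elec_pct"] for d in districts.values()) / total_area
)

# Downscaling assigns the province average to every district, so the
# error for each district is its distance from that average.
errors = {
    name: abs(d["elec_pct"] - province_avg) for name, d in districts.items()
}

print(f"province average: {province_avg:.1f}%")
for name, err in errors.items():
    print(f"{name}: downscaling error = {err:.1f} percentage points")
```

With these made-up numbers the province average sits close to the large rural district's value, so almost all of the error lands on the small capital district, which is the pattern we seem to see in Fig 1.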